A FIRST COURSE IN 
MATHEMATICAL LOGIC 
AND SET THEORY 


CONTENTS


Preface xiii
Acknowledgments xv
List of Symbols xvii

1 Propositional Logic 1
  1.1 Symbolic Logic 1
      Propositions 2
      Propositional Forms 5
      Interpreting Propositional Forms 7
      Valuations and Truth Tables 10
  1.2 Inference 19
      Semantics 21
      Syntactics 23
  1.3 Replacement 31
      Semantics 31
      Syntactics 34
  1.4 Proof Methods 40
      Deduction Theorem 40
      Direct Proof 44
      Indirect Proof 47
  1.5 The Three Properties 51
      Consistency 51
      Soundness 55
      Completeness 58

2 First-Order Logic 63
  2.1 Languages 63
      Predicates 63
      Alphabets 67
      Terms 70
      Formulas 71
  2.2 Substitution 75
      Terms 75
      Free Variables 76
      Formulas 78
  2.3 Syntactics 85
      Quantifier Negation 85
      Proofs with Universal Formulas 87
      Proofs with Existential Formulas 90
  2.4 Proof Methods 96
      Universal Proofs 97
      Existential Proofs 99
      Multiple Quantifiers 100
      Counterexamples 102
      Direct Proof 103
      Existence and Uniqueness 104
      Indirect Proof 105
      Biconditional Proof 107
      Proof of Disjunctions 111
      Proof by Cases 112

3 Set Theory 117
  3.1 Sets and Elements 117
      Rosters 118
      Famous Sets 119
      Abstraction 121
  3.2 Set Operations 126
      Union and Intersection 126
      Set Difference 127
      Cartesian Products 130
      Order of Operations 132
  3.3 Sets within Sets 135
      Subsets 135
      Equality 137
  3.4 Families of Sets 148
      Power Set 151
      Union and Intersection 151
      Disjoint and Pairwise Disjoint 155

4 Relations and Functions 161
  4.1 Relations 161
      Composition 163
      Inverses 165
  4.2 Equivalence Relations 168
      Equivalence Classes 171
      Partitions 172
  4.3 Partial Orders 177
      Bounds 180
      Comparable and Compatible Elements 181
      Well-Ordered Sets 183
  4.4 Functions 189
      Equality 194
      Composition 195
      Restrictions and Extensions 196
      Binary Operations 197
  4.5 Injections and Surjections 203
      Injections 205
      Surjections 208
      Bijections 211
      Order Isomorphisms 212
  4.6 Images and Inverse Images 216

5 Axiomatic Set Theory 225
  5.1 Axioms 225
      Equality Axioms 226
      Existence and Uniqueness Axioms 227
      Construction Axioms 228
      Replacement Axioms 229
      Axiom of Choice 230
      Axiom of Regularity 234
  5.2 Natural Numbers 237
      Order 239
      Recursion 242
      Arithmetic 243
  5.3 Integers and Rational Numbers 249
      Integers 250
      Rational Numbers 253
      Actual Numbers 256
  5.4 Mathematical Induction 257
      Combinatorics 260
      Euclid’s Lemma 264
  5.5 Strong Induction 268
      Fibonacci Sequence 268
      Unique Factorization 271
  5.6 Real Numbers 274
      Dedekind Cuts 275
      Arithmetic 278
      Complex Numbers 280

6 Ordinals and Cardinals 283
  6.1 Ordinal Numbers 283
      Ordinals 286
      Classification 290
      Burali-Forti and Hartogs 292
      Transfinite Recursion 293
  6.2 Equinumerosity 298
      Order 300
      Diagonalization 303
  6.3 Cardinal Numbers 307
      Finite Sets 308
      Countable Sets 310
      Alephs 313
  6.4 Arithmetic 316
      Ordinals 316
      Cardinals 322
  6.5 Large Cardinals 327
      Regular and Singular Cardinals 328
      Inaccessible Cardinals 331

7 Models 333
  7.1 First-Order Semantics 333
      Satisfaction 335
      Groups 340
      Consequence 346
      Coincidence 348
      Rings 353
  7.2 Substructures 361
      Subgroups 363
      Subrings 366
      Ideals 368
  7.3 Homomorphisms 374
      Isomorphisms 380
      Elementary Equivalence 384
      Elementary Substructures 388
  7.4 The Three Properties Revisited 394
      Consistency 394
      Soundness 397
      Completeness 399
  7.5 Models of Different Cardinalities 409
      Peano Arithmetic 410
      Compactness Theorem 414
      Löwenheim-Skolem Theorems 415
      The von Neumann Hierarchy 417

Appendix: Alphabets 427

References 429

Index 435


PREFACE 


This book is inspired by The Structure of Proof: With Logic and Set Theory published 
by Prentice Hall in 2002. My motivation for that text was to use symbolic logic as a 
means by which to learn how to write proofs. The purpose of this book is to present 
mathematical logic and set theory to prepare the reader for more advanced courses that 
deal with these subjects either directly or indirectly. It does this by starting with propo- 
sitional logic and first-order logic with sections dedicated to the connection of logic 
to proof-writing. Building on this, set theory is developed using first-order formulas. 
Set operations, subsets, equality, and families of sets are covered followed by relations 
and functions. The axioms of set theory are introduced next, and then sets of num- 
bers are constructed. Finite numbers, such as the natural numbers and the integers, are 
defined first. All of these numbers are actually sets constructed so that they resemble 
the numbers that are their namesakes. Then, the infinite ordinal and cardinal numbers 
appear. The last chapter of the book is an introduction to model theory, which includes 
applications to abstract algebra and the proofs of the completeness and compactness 
theorems. The text concludes with a note on Gödel’s incompleteness theorems.


MICHAEL L. O’LEARY


Glen Ellyn, Illinois 
July 2015 




ACKNOWLEDGMENTS 


Thanks are due to Susanne Steitz-Filler, Senior Editor at Wiley, for her support of this 
project. Thanks are also due to Sari Friedman and Katrina Maceda, both at Wiley, for 
their work in producing this book. Lastly, I wish to thank the anonymous reviewer 
whose comments proved beneficial. 

On a personal note, I would like to express my gratitude to my parents for their 
continued caring and support; to my brother and his wife, who will make sure my 
niece learns her math; to my dissertation advisor, Paul Eklof, who taught me both set 
theory and model theory; to Robert Meyer, who introduced me to symbolic logic; to
David Elfman, who taught me about logic through programming on an Apple II; and
to my wife, Barb, whose love and patience supported me as I finished this book. 




SYMBOLS 



Symbol Page(s) 
= 68, 226 
A 68 
ST 68 
NT 69 
AR 69 
AR’ 69 
GR 69 
RI 69 
OF 69 
TERMS(A) 70 
S 70 
L(S) 73 
£ 75 

x 
P(x) 80 
a 88 
| 98 
c| 116, 307 


A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’Leary.
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc. 




CHAPTER 1 


PROPOSITIONAL LOGIC 


1.1 SYMBOLIC LOGIC 


Let us define mathematics as the study of number and space. Although representations 
can be found in the physical world, the subject of mathematics is not physical. Instead, 
mathematical objects are abstract, such as equations in algebra or points and lines in 
geometry. They are found only as ideas in minds. These ideas sometimes lead to the 
discovery of other ideas that do not manifest themselves in the physical world as when 
studying various magnitudes of infinity, while others lead to the creation of tangible 
objects, such as bridges or computers. 

Let us define logic as the study of arguments. In other words, logic attempts to codify 
what counts as legitimate means by which to draw conclusions from given information. 
There are many variations of logic, but they all can be classified into one of two types. 
There is inductive logic in which if the argument is good, the conclusion will probably 
follow from the hypotheses. This is because inductive logic rests on evidence and 
observation, so there can never be complete certainty whether the conclusions reached 
do indeed describe the universe. An example of an inductive argument is: 




A red sky in the morning means that a storm is coming. 
We see a red sky this morning. 
Therefore, there will be a storm today. 


Whether this is a trustworthy argument or not rests on the strength of the predictive
abilities of a red sky, and we know about that by past observations. Thus, the argument 
is inductive. The other type is deductive logic. Here the methods yield conclusions 
with complete certainty, provided, of course, that no errors in reasoning were made. 
An example of a deductive argument is: 


All geometers are mathematicians. 
Euclid is a geometer. 
Therefore, Euclid is a mathematician. 


Whether Euclid refers to the author of the Elements or is Mr. Euclid from down the 
street is irrelevant. The argument works because the third sentence must follow from 
the first two. 

As anyone who has solved an equation or written a proof can attest, deductive logic 
is the realm of the mathematician. This is not to say that there are not other aspects to 
the discovery of mathematical results, such as drawing conclusions from diagrams or 
patterns, using computational software, or simply making a lucky guess, but it is to say 
that to accept a mathematical statement requires the production of a deductive proof of 
that statement. For example, in elementary algebra, we know that given 


2x - 5 = 1,


we can conclude


2x = 6


and then


x = 3.


As each of the steps is legal, it is certain that the conclusion of x = 3 follows. In 
geometry, we can write a two-column proof that shows that 


∠B ≅ ∠D
is guaranteed to follow from 


ABCD is a parallelogram. 


The study of these types of arguments, those that are deductive and mathematical in 
content, is called mathematical logic. 


Propositions 


To study arguments, one must first study sentences because they are the main parts of 
arguments. However, not just any type of sentence will do. Consider 


all squares are rectangles. 




The purpose of this sentence is to affirm that things called squares also belong to the 
category of things called rectangles. In this case, the assertion made by the sentence is 
correct. Also, consider, 


circles are not round. 


This sentence denies that things called circles have the property of being round. This 
denial is incorrect. If a sentence asserts or denies accurately, the sentence is true, but if 
it asserts or denies inaccurately, the sentence is false. These are the only truth values 
that a sentence can have, and if a sentence has one, it does not have the other. As 
arguments intend to draw true conclusions from presumably true given sentences, we 
limit the sentences that we study to only those with a truth value. This leads us to our 
first definition. 


DEFINITION 1.1.1
A sentence that is either true or false is called a proposition. 


Not all sentences are propositions, however. Questions, exclamations, commands, 
or self-contradictory sentences like the following examples can neither be asserted nor 
be denied. 


• Is mathematics logic?
• Hey there!
• Do not panic.
• This sentence is false.


Sometimes it is unclear whether a sentence identifies a proposition. This can be 
due to factors such as imprecision or poor sentence structure. Another example is the 
sentence 


it is a triangle. 


Is this true or false? It is impossible to know because, unlike the other words of the 
sentence, the meaning of the word it is not determined. In this sentence, the word it is 
acting like a variable as in x + 2 = 5. As the value of x is undetermined, the sentence 
x+2 = 5 is neither true nor false. However, if x represents a particular value, we could 
make a determination. For example, if x = 3, the sentence is true, and if x = 10, the 
sentence is false. Likewise, if it refers to a particular object, then it is a triangle would 
identify a proposition. 

There are two types of propositions. An atom is a proposition that is not comprised 
of other propositions. Examples include 


the angle sum of a triangle equals two right angles 
and 


some quadratic equations have real solutions. 


A proposition that is not an atom but is constructed using other propositions is called 
a compound proposition. There are five types. 


• A negation of a given proposition is a proposition that denies the truth of the
given proposition. For example, the negation of 3 + 8 = 5 is 3 + 8 ≠ 5. In this
case, we say that 3 + 8 = 5 has been negated. Negating the proposition the sine
function is periodic yields the sine function is not periodic.


• A conjunction is a proposition formed by combining two propositions (called
conjuncts) with the word and. For example, 


the base angles of an isosceles triangle are congruent, 
and a square has no right angles 


is a conjunction with the base angles of an isosceles triangle are congruent and 
a square has no right angles as conjuncts. 


• A disjunction is a proposition formed by combining two propositions (called
disjuncts) with the word or. The sentence 


the base angles of an isosceles triangle are congruent, 
or a square has no right angles 


is a disjunction. 


• An implication is a proposition that claims a given proposition (called the
antecedent) entails another proposition (called the consequent). Implications are
also known as conditional propositions. For example,


if rectangles have four sides, then squares have four sides (1.1)


is a conditional proposition. Its antecedent is rectangles have four sides, and its 
consequent is squares have four sides. This implication can also be written as 


rectangles have four sides implies that squares have four sides, 
squares have four sides if rectangles have four sides, 
rectangles have four sides only if squares have four sides, 
and 
if rectangles have four sides, squares have four sides. 

A conditional proposition can also be written using the words sufficient and
necessary. The word sufficient means “adequate” or “enough,” and necessary means
“needed” or “required.” Thus, the sentence


rectangles having four sides is sufficient for squares to have four sides 


translates (1.1). In other words, the fact that rectangles have four sides is enough 
for us to know that squares have four sides. Likewise, 


squares having four sides is necessary for rectangles to have four sides 


is another translation of the implication because it means that squares must have 
four sides because rectangles have four sides. Summing up, the antecedent is
sufficient for the consequent, and the consequent is necessary for the antecedent. 


• A biconditional proposition is the conjunction of two implications formed by
exchanging their antecedents and consequents. For example, 


if rectangles have four sides, then squares have four sides, 
and if squares have four sides, then rectangles have four sides. 


To remove the redundancy in this sentence, notice that the first conditional can 
be written as 


rectangles have four sides only if squares have four sides 
and the second conditional can be written as 
rectangles have four sides if squares have four sides, 
resulting in the biconditional being written as 
rectangles have four sides if and only if squares have four sides 
or the equivalent 


rectangles having four sides is necessary and sufficient 
for squares to have four sides. 


Propositional Forms 


As a typical human language has many ways to express the same thought, it is beneficial
to study propositions by translating them into a notation that has a very limited collec- 
tion of symbols yet is still able to express the basic logic of the propositions. Once this 
is done, rules that determine the truth values of propositions using the new notation can 
be developed. Any such system designed to concisely study human reasoning is called 
a symbolic logic. Mathematical logic is an example of symbolic logic. 

Let p be a finite sequence of characters from a given collection of symbols. Call 
the collection an alphabet. Call p a string over the alphabet. The alphabet chosen so 
that p can represent a mathematical proposition is called the proposition alphabet and 
consists of the following symbols. 


• Propositional variables: Uppercase English letters, P, Q, R, ..., or uppercase
English letters with subscripts, Pₙ, Qₙ, Rₙ, ..., where n = 0, 1, 2, ...

• Connectives: ¬, ∧, ∨, →, ↔

• Grouping symbols: (, ), [, ].


The sequences P ∨ Q and P₁Q₁∧ ↔ (((, as well as the empty string (a string with no
characters), are examples of strings over this alphabet, but only certain strings will be chosen
for our study. A string is selected because it is able to represent a proposition. These 
strings will be determined by a method called a grammar. The grammar chosen for 
our present purposes is given in the next definition. It is given recursively. That is, the 
definition is first given for at least one special case, and then the definition is given for 
other cases in terms of itself. 


DEFINITION 1.1.2


A propositional form is a nonempty string over the proposition alphabet such
that

• every propositional variable is a propositional form.

• ¬p is a propositional form if p is a propositional form.

• (p ∧ q), (p ∨ q), (p → q), and (p ↔ q) are propositional forms if p and q are
propositional forms.


We follow the convention that parentheses can be replaced with brackets and
outermost parentheses or brackets can be omitted. As with propositions, a
propositional form that consists only of a propositional variable is an atom. Otherwise,
it is compound.


The strings P, Q₁, ¬P, (P₁ ∨ P₂) ∧ P₃, and (P → Q) ∧ (R ↔ ¬P) are examples of
propositional forms. To prove that the last string is a propositional form, proceed using
Definition 1.1.2 by noting that (P → Q) ∧ (R ↔ ¬P) is the result of combining P → Q
and R ↔ ¬P with ∧. The propositional form P → Q is from P and Q combined with
→, and R ↔ ¬P is from R and ¬P combined with ↔. These and ¬P are propositional
forms because P, Q, and R are propositional variables. This derivation yields the
following parsing tree:


          (P → Q) ∧ (R ↔ ¬P)
            /            \
        P → Q           R ↔ ¬P
        /   \           /    \
       P     Q         R     ¬P
                              |
                              P


The parsing tree yields the formation sequence of the propositional form:

P, Q, R, ¬P, P → Q, R ↔ ¬P, (P → Q) ∧ (R ↔ ¬P).


The sequence is formed by listing each distinct term of the tree starting at the bottom
row and moving upwards.
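
The recursive shape of Definition 1.1.2 can be mirrored in a short program. The following Python sketch is only an illustration and not part of the text's development: it assumes a hypothetical encoding in which a propositional variable is a string and a compound form is a tuple, and the names show and formation_sequence are invented here. It lists every distinct subform after the parts it is built from, so the order can differ from the row-by-row listing above while still being a formation sequence.

```python
# Hypothetical encoding (an assumption of this sketch, not the book's notation):
# a propositional variable is a string such as "P"; a compound form is a tuple
# ("¬", p) or (op, p, q) with op one of "∧", "∨", "→", "↔".

def show(form):
    """Render a form as a string, omitting the outermost parentheses."""
    if isinstance(form, str):
        return form
    if form[0] == "¬":
        return "¬" + show_operand(form[1])
    op, left, right = form
    return f"{show_operand(left)} {op} {show_operand(right)}"


def show_operand(form):
    """Render a form used as an operand; binary compounds get parentheses."""
    if isinstance(form, str) or form[0] == "¬":
        return show(form)
    return f"({show(form)})"


def formation_sequence(form):
    """List each distinct subform so that every form follows its parts."""
    sequence = []

    def visit(subform):
        if not isinstance(subform, str):
            for part in subform[1:]:
                visit(part)
        if subform not in sequence:
            sequence.append(subform)

    visit(form)
    return sequence


# The form (P → Q) ∧ (R ↔ ¬P) from the discussion above.
form = ("∧", ("→", "P", "Q"), ("↔", "R", ("¬", "P")))
print(", ".join(show(f) for f in formation_sequence(form)))
```

The last entry of the computed sequence is always the form itself, and the atoms P, Q, and R each appear exactly once.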


EXAMPLE 1.1.3

Make the following assignments:

p := R ↔ (P ∧ Q),
q := (R ↔ P) ∧ Q.

The symbol := indicates that an assignment has been made. It means that the
propositional form on the right has been assigned to the lowercase letter on the
left. Using these designations, we can write new propositional forms using p and
q. The propositional form p ∧ q is

[R ↔ (P ∧ Q)] ∧ [(R ↔ P) ∧ Q]

with the formation sequence,

P, Q, R, P ∧ Q, R ↔ P,
R ↔ (P ∧ Q), (R ↔ P) ∧ Q, [R ↔ (P ∧ Q)] ∧ [(R ↔ P) ∧ Q],

and ¬q → p is

¬[(R ↔ P) ∧ Q] → [R ↔ (P ∧ Q)]

with the formation sequence

P, Q, R, R ↔ P, P ∧ Q, (R ↔ P) ∧ Q, R ↔ (P ∧ Q),
¬[(R ↔ P) ∧ Q], ¬[(R ↔ P) ∧ Q] → [R ↔ (P ∧ Q)].


Interpreting Propositional Forms 


Notice that determining whether a string is a propositional form is independent of the 
meaning that we give the symbols. However, as we do want these symbols to con- 
vey meaning, we assume that the propositional variables represent atoms and set this 
interpretation on the connectives: 


¬   not
∧   and
∨   or
→   implies
↔   if and only if


Because of this interpretation, name the compound propositional forms as follows:


¬p      negation
p ∧ q   conjunction
p ∨ q   disjunction
p → q   implication
p ↔ q   biconditional


EXAMPLE 1.1.4


To see how this works, assign some propositions to some propositional variables:


P := The sine function is not one-to-one.
Q := The square root function is one-to-one.
R := The absolute value function is not onto.


The following symbols represent the indicated propositions:


• ¬R
The absolute value function is onto.

• ¬P ∨ ¬Q
The sine function is one-to-one, or the square root function is not one-to-one.

• Q → R
If the square root function is one-to-one, the absolute value function is not onto.

• R ↔ P
The absolute value function is not onto if and only if the sine function is not
one-to-one.

• P ∧ Q
The sine function is not one-to-one, and the square root function is one-to-one.

• ¬P ∧ Q
The sine function is one-to-one, and the square root function is one-to-one.

• ¬(P ∧ Q)
It is not the case that the sine function is not one-to-one and the square root
function is one-to-one.


The proposition 


the absolute value function is not onto if and only if 
both the sine function is not one-to-one and the square root function is one-to-one 


is translated as R ↔ (P ∧ Q) and


the absolute value function is not onto if and only if the sine function is not
one-to-one, and the square root function is one-to-one


is translated as (R ↔ P) ∧ Q. If the parentheses are removed, the resulting string is
R ↔ P ∧ Q. It is simpler, but it is not clear how it should be interpreted. To eliminate
its ambiguity, we introduce an order of connectives as in algebra. In this way, certain 
strings without parentheses can be read as propositional forms. 


DEFINITION 1.1.5 [Order of Connectives]


To interpret a propositional form, read from left to right and use the following
precedence:

• propositional forms within parentheses or brackets (innermost first),
• negations,
• conjunctions,
• disjunctions,
• conditionals,
• biconditionals.


EXAMPLE 1.1.6


To write the propositional form ¬P ∨ Q ∧ R with parentheses, we begin by
interpreting ¬P. According to the order of operations, the conjunction is next, so we
evaluate Q ∧ R. This is followed by the disjunction, and we have the propositional
form ¬P ∨ (Q ∧ R).
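
The order of connectives can be tried out mechanically. The Python sketch below is an illustration under stated assumptions rather than the text's own procedure: it handles only single-letter variables without subscripts, it groups repeated connectives of the same kind from the left (one reading of "read from left to right"), and the names PRECEDENCE, parse_unary, parse_binary, and fully_parenthesize are invented for this example. It returns the input with all parentheses restored, including the outermost pair.

```python
# Binding strength of each binary connective, following Definition 1.1.5:
# a larger number binds more tightly. Negation is handled in parse_unary.
PRECEDENCE = {"∧": 4, "∨": 3, "→": 2, "↔": 1}
CLOSERS = {"(": ")", "[": "]"}


def parse_unary(tokens):
    """Parse a variable, a negation, or a bracketed form."""
    token = tokens.pop(0)
    if token == "¬":
        return "¬" + parse_unary(tokens)
    if token in CLOSERS:
        inner = parse_binary(tokens, 1)
        if not tokens or tokens.pop(0) != CLOSERS[token]:
            raise ValueError("mismatched grouping symbol")
        return inner
    if token.isalpha() and token.isupper():
        return token
    raise ValueError(f"unexpected symbol {token!r}")


def parse_binary(tokens, min_strength):
    """Parse binary connectives whose strength is at least min_strength."""
    left = parse_unary(tokens)
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_strength:
        op = tokens.pop(0)
        # The right operand may only contain connectives that bind more tightly.
        right = parse_binary(tokens, PRECEDENCE[op] + 1)
        left = f"({left} {op} {right})"
    return left


def fully_parenthesize(text):
    """Return the form with every binary connective explicitly grouped."""
    tokens = [ch for ch in text if not ch.isspace()]
    result = parse_binary(tokens, 1)
    if tokens:
        raise ValueError("unexpected trailing symbols")
    return result


print(fully_parenthesize("¬P ∨ Q ∧ R"))
print(fully_parenthesize("R ↔ P ∧ Q"))
```

Under these assumptions the first call groups the conjunction before the disjunction, and the second reads R ↔ P ∧ Q as R ↔ (P ∧ Q).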


EXAMPLE 1.1.7


To interpret P ∧ Q ∨ R correctly, use the order of operations. We discover that
it has the same meaning as (P ∧ Q) ∨ R, but how is this distinguished from
P ∧ (Q ∨ R) in English? Parentheses are not appropriate because they are not
used as grouping symbols in sentences. Instead, use either...or. Then, using the
assignments from Example 1.1.4, (P ∧ Q) ∨ R can be translated as


either the sine function is not one-to-one 
and the square root function is one-to-one, 
or the absolute value function is not onto. 


Notice that either...or works as a set of parentheses. We can use this to translate
P ∧ (Q ∨ R):


the sine function is not one-to-one, 
and either the square root function is one-to-one 
or the absolute value function is not onto. 




Be careful to note that the either-or phrasing is logically inclusive. For instance, 
some colleges require their students to take either logic or mathematics. This 
choice is meant to be exclusive in the sense that only one is needed for graduation. 
However, it is not logically exclusive. A student can take logic to satisfy the 
requirement yet still take a math class. 


EXAMPLE 1.1.8 


Let us interpret ¬(P ∧ Q). We can try translating this as not P and Q, but this
represents ¬P ∧ Q according to the order of operations. To handle a propositional
form such as ¬(P ∧ Q), use a phrase like it is not the case or it is false and the
word both. Therefore, ¬(P ∧ Q) becomes


it is not the case that both P and Q
or
it is false that both P and Q.


For instance, make the assignments.


P := quadratic equations have at most two real solutions,
Q := the discriminant can be negative.


Then,


quadratic equations do not have at most two real solutions,
and the discriminant can be negative


is a translation of ¬P ∧ Q. On the other hand, ¬(P ∧ Q) can be


it is not the case that both quadratic equations have at most two real solutions
and the discriminant can be negative.


To interpret ¬P ∧ ¬Q, use neither-nor:


neither do quadratic equations have at most two real solutions, 
nor can the discriminant be negative. 


Valuations and Truth Tables 


Propositions have truth values, but propositional forms do not. This is because every 
propositional form represents any one of infinitely many propositions. However, once a 
propositional form is identified with a proposition, there should be a process by which 
the truth value of the proposition is associated with the propositional form. This is 
done with a rule v called a valuation. The input of v is a propositional form, and its 
output is T or F. Suppose that P is a propositional variable. If P has been assigned a
proposition,


v(P) = T if P is true,
v(P) = F if P is false.


For example, if P := 2 + 3 = 5, then v(P) = T, and if P := 2 + 3 = 7, then v(P) = F.
If P has not been assigned a proposition, then v(P) can be defined arbitrarily as either
T or F.

The valuation of a compound propositional form is defined using truth tables. Let 
p and q be given propositional forms. Along the top row write p and, if needed, q. 
Draw a vertical line. To its right identify the desired propositional form consisting 
of p, possibly q, and a single connective. In the body of the table, on the left of the 
vertical line are all combinations of T and F for p and possibly q. On the right are the 
results of applying the connective. Each connective will have its own truth table, and 
we want to define these tables so that they match our understanding of the meaning of 
each connective. 

Since the truth value of the negation of a given proposition is the opposite of that 
proposition’s truth value, 


p | ¬p
T | F
F | T


This means that v(¬p) = F if v(p) = T and v(¬p) = T if v(p) = F.
The conjunction, 


3 + 6 = 9, and all even integers are divisible by two,

is true, but

all integers are rational, and 4 is odd

is false because the second conjunct is false. The disjunction

3 + 7 = 9, or all even integers are divisible by three,

is false since both disjuncts are false. On the other hand,

3 + 7 = 9, or circles are round

is true. This illustrates that

• a conjunction is true when both of its conjuncts are true, and false otherwise, and
• a disjunction is true when at least one disjunct is true, and false otherwise.


We use these principles to define the truth tables for p A q and p V q: 


p  q | p ∧ q        p  q | p ∨ q
T  T |   T          T  T |   T
T  F |   F          T  F |   T
F  T |   F          F  T |   T
F  F |   F          F  F |   F


We must remember that only one disjunct needs to be true for the entire disjunction to 
be true. For this reason, the logical disjunction is sometimes called an inclusive or. 
The propositional form for the exclusive or is 


(p ∨ q) ∧ ¬(p ∧ q).
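
As a quick check of this form, the following Python sketch tabulates (p ∨ q) ∧ ¬(p ∧ q) over the four combinations of truth values. It is only an illustration: Python's True and False stand in for T and F, and the function name exclusive_or is chosen here for convenience.

```python
from itertools import product


def exclusive_or(p, q):
    """Evaluate (p ∨ q) ∧ ¬(p ∧ q) for truth values p and q."""
    return (p or q) and not (p and q)


# Print the four rows in the order T T, T F, F T, F F.
for p, q in product([True, False], repeat=2):
    print(p, q, exclusive_or(p, q))
```

The form is true exactly when the two truth values differ, which is what distinguishes it from the inclusive or.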


There are many ways to understand an implication. Sometimes it represents causa- 
tion as in 
if I score at least 70 on the exam, I will earn a passing grade. 


Other times it indicates what would have been the case if some past event had gone 
differently as in 


if I had not slept late, I would not have missed the meeting. 


Study of such conditional propositions is a very involved subject, one that need not 
concern us here because in mathematics a simpler understanding of the implication 
is enough. Suppose that P and Q are assigned propositions so that P → Q is a true
implication. In mathematics, this means that it is not the case that P is true but Q is 
false. This understanding of the conditional is known as material implication. For 
example, 


if rectangles have four sides, then squares have four sides, 

if rectangles have three sides, then squares have four sides, 
and 

if rectangles have three sides, then squares have three sides 
are all true, but 

if rectangles have four sides, then squares have three sides 


is false. Generalizing, in mathematics, p → q means ¬(p ∧ ¬q), which has the following
truth table:


p  q | ¬q   p ∧ ¬q   ¬(p ∧ ¬q)
T  T | F      F         T
T  F | T      T         F
F  T | F      F         T
F  F | T      F         T


The truth table for p → q is then defined as follows:


p  q | p → q
T  T |   T
T  F |   F
F  T |   T
F  F |   T
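
The table for material implication can be reproduced directly from the reading of p → q as ¬(p ∧ ¬q). The Python sketch below is an illustration only, with True and False standing in for T and F and with the function name implies introduced here.

```python
def implies(p, q):
    """Material implication: p → q read as ¬(p ∧ ¬q)."""
    return not (p and not q)


# Rows in the same order as the table: T T, T F, F T, F F.
rows = [(p, q, implies(p, q)) for p in (True, False) for q in (True, False)]
for p, q, value in rows:
    print(p, q, value)
```

The only row with value False is the one where the antecedent is true and the consequent is false.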


The truth table for p ↔ q is simpler because we understand p ↔ q to mean


(p → q) ∧ (q → p).


The truth table for this propositional form requires five columns:


p  q | p → q   q → p   (p → q) ∧ (q → p)
T  T |   T       T             T
T  F |   F       T             F
F  T |   T       F             F
F  F |   T       T             T


Therefore, define the truth table of p ↔ q as:


p  q | p ↔ q
T  T |   T
T  F |   F
F  T |   F
F  F |   T


This understanding of the biconditional is known as material equivalence. 
Using these truth tables, the valuation of an arbitrary propositional form can be 
defined. 


DEFINITION 1.1.9


Let p and q be propositional forms.

• v(¬p) = T if v(p) = F,
          F if v(p) = T.

• v(p ∧ q) = T if v(p) = T and v(q) = T,
             F otherwise.

• v(p ∨ q) = F if v(p) = F and v(q) = F,
             T otherwise.

• v(p → q) = F if v(p) = T and v(q) = F,
             T otherwise.

• v(p ↔ q) = T if v(p) = v(q),
             F otherwise.


EXAMPLE 1.1.10


Consider the propositional form (P ↔ Q) ∨ (R → P) where

v(P) = F, v(Q) = T, and v(R) = F.


Then, v(P ↔ Q) = F because v(P) ≠ v(Q), and v(R → P) = T because
v(R) = F. Therefore, because v(R → P) = T,


v([P ↔ Q] ∨ [R → P]) = T.
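
Definition 1.1.9 is recursive, so it translates naturally into a recursive function. The Python sketch below is a minimal illustration, not the text's own formalism: it assumes a hypothetical encoding in which a variable is a string and a compound form is a tuple, it uses True and False for T and F, and the name valuation and the dictionary v are introduced only for this example.

```python
# Hypothetical encoding (an assumption of this sketch): a variable is a string;
# a compound form is ("¬", p) or (op, p, q) with op in "∧", "∨", "→", "↔".

def valuation(form, v):
    """Extend an assignment v (dict: variable -> bool) to any form."""
    if isinstance(form, str):
        return v[form]
    if form[0] == "¬":
        return not valuation(form[1], v)
    op, left, right = form
    a, b = valuation(left, v), valuation(right, v)
    if op == "∧":
        return a and b
    if op == "∨":
        return a or b
    if op == "→":
        return not (a and not b)
    if op == "↔":
        return a == b
    raise ValueError(f"unknown connective {op!r}")


# Example 1.1.10: (P ↔ Q) ∨ (R → P) with v(P) = F, v(Q) = T, v(R) = F.
form = ("∨", ("↔", "P", "Q"), ("→", "R", "P"))
v = {"P": False, "Q": True, "R": False}
print(valuation(form, v))
```

With the assignment from Example 1.1.10 the biconditional evaluates to False and the implication to True, so the disjunction evaluates to True.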




We now generalize the definition of a truth table to create truth tables for more com- 
plicated propositional forms and then use the tables to find the valuation of a proposi- 
tional form given the valuations of its proposition variables. 


EXAMPLE 1.1.11


To write the truth table of P → Q ∧ ¬P, identify the column headings by drawing
the parsing tree for this form:


      P → Q ∧ ¬P
       /      \
      P      Q ∧ ¬P
             /    \
            Q     ¬P


Reading from the bottom, we see that a formation sequence for the propositional
form is

P, Q, ¬P, Q ∧ ¬P, P → Q ∧ ¬P.


Hence, the truth table for this form is


P  Q | ¬P   Q ∧ ¬P   P → Q ∧ ¬P
T  T | F      F          F
T  F | F      F          F
F  T | T      T          T
F  F | T      F          T


So, if v(P) = T and v(Q) = F,

v(P → Q ∧ ¬P) = F.


That is, any proposition represented by P → Q ∧ ¬P is false when the proposition
assigned to P is true and the proposition assigned to Q is false.


The propositional form in the next example has three propositional variables. To 
make clear the truth value pattern that is to the left of the vertical line, note that if there 
are n variables, the number of rows is twice the number of rows for n — 1 variables. 
To see this, start with one propositional variable. Such a truth table has only two rows. 
Adding a variable, we obtain four rows. The pattern is obtained by writing the one-variable
case twice. The first copy has a T written in front of each row. The second copy
has an F in front of each row. To obtain the pattern for three variables, copy the two-variable
pattern twice as in Figure 1.1. To generalize, if there are n variables, there will
be 2^n rows.
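
The doubling pattern can be generated rather than written by hand. The Python sketch below is an illustration only: it relies on the standard library function itertools.product, uses True and False for T and F, and introduces the name valuation_rows for this example.

```python
from itertools import product


def valuation_rows(variables):
    """Return every assignment of truth values to the given variables.

    The rows come out in the order used in the truth tables above:
    the first variable is T for the first half of the rows and F for
    the second half, and the same pattern repeats within each half.
    """
    return [
        dict(zip(variables, values))
        for values in product([True, False], repeat=len(variables))
    ]


rows = valuation_rows(["P", "Q", "R"])
print(len(rows))
for row in rows:
    print(" ".join("T" if row[name] else "F" for name in ["P", "Q", "R"]))
```

For n variables the list has 2^n entries, since each additional variable doubles the number of rows.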


Two rows     Four rows     Eight rows

P            P  Q          P  Q  R
T            T  T          T  T  T
F            T  F          T  T  F
             F  T          T  F  T
             F  F          T  F  F
                           F  T  T
                           F  T  F
                           F  F  T
                           F  F  F


Figure 1.1 Valuation patterns.


M EXAMPLE 1.1.12 
Use a truth table to find the truth value of 


if the derivative of the sine function is the cosine function 
and the second derivative of the sine function is the sine function, 
then the third derivative of the sine function is the cosine function. 


Define: 


P := the derivative of the sine function is the cosine function,
Q := the second derivative of the sine function is the sine function,
R := the third derivative of the sine function is the cosine function.


So P represents a true proposition, but Q and R represent false propositions. The
proposition is represented by
P ∧ Q → R


with truth table:


P  Q  R | P ∧ Q   P ∧ Q → R
T  T  T |   T         T
T  T  F |   T         F
T  F  T |   F         T
T  F  F |   F         T
F  T  T |   F         T
F  T  F |   F         T
F  F  T |   F         T
F  F  F |   F         T


Notice that we could have determined the truth value by simply writing one line
from the truth table:


P  Q  R | P ∧ Q   P ∧ Q → R
T  F  F |   F         T


We see that v(P ∧ Q → R) = T when v(P) = T, v(Q) = F, and v(R) = F.
Therefore, the proposition is true. 




M@ EXAMPLE 1.1.13 


Both P ∨ ¬P and P → P share an important property. Their columns in their
truth tables are all T. For example, the truth table of P ∨ ¬P is:


P | ¬P   P ∨ ¬P
T |  F      T
F |  T      T


Therefore, v(P ∨ ¬P) always equals T, no matter the choice of v.
However, the columns for P ∧ ¬P and P ↔ ¬P are all F. To check the first
one, examine its truth table:


P | ¬P   P ∧ ¬P
T |  F      F
F |  T      F


This means that v(P ∧ ¬P) is always F for every valuation v.


Based on the last example, we make the next definition. 


@ DEFINITION 1.1.14 


A propositional form p is a tautology if v(p) always equals T for every valuation 
v, and p is a contradiction if v(p) always equals F for every v. A propositional 
form that is neither a tautology nor a contradiction is called a contingency. 
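Definition 1.1.14 suggests a brute-force test: evaluate the form under every valuation and inspect the resulting column. The sketch below assumes a propositional form is given as a Python function of its variables; the name classify is hypothetical.

```python
from itertools import product
from typing import Callable


def classify(form: Callable[..., bool], n: int) -> str:
    """Classify a form in n variables by inspecting its whole truth-table column."""
    column = [form(*v) for v in product([True, False], repeat=n)]
    if all(column):
        return "tautology"
    if not any(column):
        return "contradiction"
    return "contingency"


if __name__ == "__main__":
    print(classify(lambda p: p or not p, 1))     # P or not P
    print(classify(lambda p: p and not p, 1))    # P and not P
    print(classify(lambda p: p == (not p), 1))   # P <-> not P
    print(classify(lambda p, q: p and q, 2))     # P and Q
```

The check visits all 2ⁿ valuations, so it is practical only for a small number of variables.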


Exercises 


1. Identify each sentence as either a proposition or not a proposition. Explain.

(a) Trisect the angle.
(b) Some exponential functions are increasing.
(c) All exponential functions are increasing.
(d) 3 + 8 = 18
(e) 3 + x = 18
(f) Yea, logic!
(g) A triangle is a three-sided polygon.
(h) The function is differentiable.
(i) This proposition is true.
(j) This proposition is not true.


2. Identify the antecedent and the consequent for the given implications.

(a) If the triangle has two congruent sides, it is isosceles.
(b) The polynomial has at most two roots if it is a quadratic.
(c) The data is widely spread only if the standard deviation is large.
(d) The function being constant implies that its derivative is zero.
(e) The system of equations is consistent is necessary for it to have a solution.
(f) A function is even is sufficient for its square to be even.


3. Give the truth value of each proposition. 

(a) A system of equations always has a solution, or a quadratic equation always 
has a real solution. 

(b) It is false that every polynomial function in one variable is differentiable. 

(c) Vertical lines have no slope, and lines through the origin have a positive 
y-intercept. 

(d) Every integer is even, or every even natural number is an integer. 

(e) If every parabola intersects the x-axis, then an ellipse has only one vertex. 

(f) The sine function is periodic if and only if every exponential function is al- 
ways nonnegative. 

(g) It is not the case that 2 + 4 ≠ 6.

(h) The distance between two points is always positive if every line segment is 
horizontal. 

(i) The derivative of a constant function is zero is necessary for the product rule 
to be true. 

(j) The derivative of the sine function being cosine is sufficient for the derivative 
of the cosine function being sine. 

(k) Any real number is negative or positive, but not both. 


4. For each sentence, fill in the blank using as many of the words and, or, if, and if and
only if as possible to make the proposition true.

(a) Triangles have three sides ______ 3 + 5 = 6.
(b) 3 + 5 = 6 ______ triangles have three sides.
(c) Ten is the largest integer ______ zero is the smallest integer.
(d) The derivative of a constant function is zero ______ tangent lines for
increasing functions have positive slope.


5. Extend Figure 1.1 by writing the typical pattern of Ts and Fs for the truth table of 
a propositional form with four propositional variables and then with five propositional 
variables. 


6. Use a parsing tree to show that the given string is a propositional form.
(a) P ∧ Q ∨ R
(b) Q ↔ R ∨ ¬Q
(c) P → Q → R → S
(d) ¬P ∧ Q ∨ (P → Q) ∧ ¬S
(e) (P ∧ Q → Q) ∧ P → Q
(f) ¬¬P ∨ P ∧ S → Q ∨ [R → ¬P → ¬(Q ∨ R)]


7. Define:

P := the angle sum of a triangle is 180,
Q := 3 + 7 = 10,
R := the sine function is continuous.

Translate the given propositional forms into English.

(a) P ∨ Q
(b) P ∧ Q
(c) P ∧ ¬Q
(d) Q ∨ ¬R
(e) Q ↔ ¬R
(f) R → Q
(g) P ∨ R → ¬Q
(h) Q ↔ R ∧ ¬Q
(i) ¬(P ∧ Q)
(j) ¬(P ∨ Q)
(k) P ∨ Q ∧ R
(l) (P ∨ Q) ∧ R


8. Write the following sentences as propositional forms using the variables P, Q, and
R as defined in Exercise 7.

(a) The sine function is continuous, and 3 + 7 = 10.
(b) The angle sum of a triangle is 180, or the angle sum of a triangle is 180.
(c) If 3 + 7 = 10, then the sine function is not continuous.
(d) The angle sum of a triangle is 180 if and only if the sine function is continuous.
(e) The sine function is continuous if and only if 3 + 7 = 10 implies that the
angle sum of a triangle is not 180.
(f) It is not the case that 3 + 7 ≠ 10.


9. Let v(P) = T, v(Q) = T, v(R) = F, and v(S) = F. Find the given valuations. (See
Exercise 6.)

(a) P ∧ Q ∨ R
(b) Q ↔ R ∨ ¬Q
(c) P → Q → R → S
(d) ¬P ∧ Q ∨ (P → Q) ∧ ¬S
(e) (P ∧ Q → Q) ∧ P → Q
(f) ¬¬P ∨ P ∧ S → Q ∨ [R → ¬P → ¬(Q ∨ R)]


10. Write the truth table for each of the given propositional forms. 


(a) ¬P → P
(b) P → ¬Q
(c) (P ∨ Q) ∧ ¬(P ∧ Q)
(d) (P → Q) ∨ (Q ↔ P)
(e) P ∧ (Q ∨ R)
(f) P ∨ Q → R
(g) P → Q ∧ ¬(R ∨ P)
(h) P → Q ↔ R → S
PV(7QASRAQ 


Q) 




(“PV Q)A(P > Q]v 7S) 


11. Check the truth value of these propositions using truth tables as in Example 1.1.12.

(a) If 2 + 3 = 7, then 5 − 9 ≠ 0.
(b) If a square is round implies that some functions have a derivative at x = 2,
then every function has a derivative at x = 2.
(c) Either four is odd or two is even implies that three is even.
(d) Every even integer is divisible by 4 if and only if either 7 divides 21 or 9
divides 12.
(e) The graph of the tangent function has asymptotes, and if sine is an increasing
function, then cosine is a decreasing function.


12. If possible, find propositional forms p and q such that

(a) p ∧ q is a tautology.
(b) p ∨ q is a contradiction.
(c) ¬p is a tautology.
(d) p → q is a contradiction.
(e) p ↔ q is a tautology.
(f) p ↔ q is a contradiction.


1.2 INFERENCE 


Now that we have a collection of propositional forms and a means by which to inter- 
pret them as either true or false, we want to define a system that expands these ideas to 
include methods by which we can prove certain propositional forms from given propo- 
sitional forms. What we will define is familiar because it is similar to what Euclid did 
with his geometry. Take, for example, the familiar result, 


opposite angles in a parallelogram are congruent. 


In other words, 


if ABCD is a parallelogram, then ∠B ≅ ∠D,


which translates to:


Given: ABCD is a parallelogram,
Prove: ∠B ≅ ∠D.


To demonstrate this, we draw a diagram,


A        B


D        C


and then write a proof:


1. ABCD is a parallelogram    Given
2. Join AC                    Postulate
3. ∠DAC ≅ ∠BCA                Alternate interior angles
4. ∠ACD ≅ ∠CAB                Alternate interior angles
5. AC ≅ AC                    Reflexive
6. △ACD ≅ △CAB                ASA
7. ∠B ≅ ∠D                    Corresponding parts


Euclid’s geometry consists of geometric propositions that are established by proofs 
like the above. These proofs rely on rules of logic, previously proved propositions 
(lemmas, theorems, and corollaries), and propositions that are assumed to be true (the 
postulates). Using this system of thought, we can show which geometric propositions 
follow from the postulates and conclude which propositions are true, whatever it means 
for a geometric proposition to be true. Euclidean geometry serves as a model for the 
following modern definition. 


@ DEFINITION 1.2.1 
A logical system consists of the following: 

• An alphabet

• A grammar

• Propositional forms that require no proof

• Rules that determine truth

• Rules that are used to write proofs.

Although Euclid did not provide an alphabet or a grammar specifically for his geometry,
his system did include the last three aspects of a logical system. In this chapter we
develop the logical system known as propositional logic. Its alphabet, grammar, and
rules that determine truth were defined in Section 1.1. The remainder of this chapter is
spent establishing the other two components.

Consider the following collection of propositions:


If squares are rectangles, then squares are quadrilaterals. 
Squares are rectangles. 
Therefore, squares are quadrilaterals. 


This is an example of a deduction, a collection of propositions of which one is supposed 
to follow necessarily from the others. In this particular case, 


if squares are rectangles, then squares are quadrilaterals 


and 


squares are rectangles


are the premises, and


squares are quadrilaterals


is the conclusion. We recognize that in this case, the conclusion does follow from the
premises because whenever the premises are true, the conclusion must also be true.
When this is the case, the deduction is semantically valid; otherwise, it is semantically
invalid.

Notice that not only do we see that the deduction works because of the meaning of 
the propositions, but we also see that it is valid based on the forms of the sentences. In 
other words, we also recognize this deduction as valid: 


If Hausdorff spaces are preregular, their points can be separated. 
Hausdorff spaces are preregular. 
Therefore, their points can be separated. 


Although we might not know the terms Hausdorff space, preregular, and separated, we 
recognize the deduction as valid because it is of the same pattern as the first deduction: 


p → q
p                    (1.2)
∴ q


When the deduction is found to work based on its form, the deduction is syntactically
valid; otherwise, it is syntactically invalid.

We study both types of validity by examining general patterns of deductions and 
choosing rules that determine which forms correspond to deductions that are valid se- 
mantically and which forms correspond to deductions that are valid syntactically. 


Semantics 


The study of meaning is called semantics. We began this study when we wrote truth 
tables. These are characterized as semantic because the truth value of a proposition is 
based on its meaning. Our goal is to use truth tables to determine when an argument 
form, an example being (1.2), corresponds to a deduction that is semantically valid. 
We begin with a definition. 


@ DEFINITION 1.2.2 


Let p₀, p₁, ..., pₙ₋₁, and q be propositional forms.

• If q is a tautology, write ⊨ q.

• Define p₀, p₁, ..., pₙ₋₁ to logically imply q if

⊨ p₀ ∧ p₁ ∧ ··· ∧ pₙ₋₁ → q.

When p₀, p₁, ..., pₙ₋₁ logically imply q, write

p₀, p₁, ..., pₙ₋₁ ⊨ q,

and say that q is a consequence of p₀, p₁, ..., pₙ₋₁. Call the propositional
forms p₀, p₁, ..., pₙ₋₁ the premises of the implication and q the conclusion.


Notice that if p₀, p₁, ..., pₙ₋₁ ⊨ q, then for any valuation v, whenever v(pᵢ) = T for all
i = 0, 1, ..., n − 1, it must be the case that v(q) = T. Moreover, any deduction with
premises represented by p₀, p₁, ..., pₙ₋₁ and conclusion by q is semantically valid if

p₀, p₁, ..., pₙ₋₁ ⊨ q.


@ EXAMPLE 1.2.3
Because of Example 1.1.13, both ⊨ P → P and ⊨ P ∨ ¬P.


EXAMPLE 1.2.4 
Prove: P → Q, P ⊨ Q


To accomplish this, show that the propositional form
(P → Q) ∧ P → Q,


with antecedent equal to the conjunction of the premises and consequent consisting
of the conclusion, is a tautology.


P  Q | P → Q   (P → Q) ∧ P   (P → Q) ∧ P → Q
T  T |   T          T               T
T  F |   F          F               T
F  T |   T          F               T
F  F |   T          F               T


Therefore, ⊨ (P → Q) ∧ P → Q, so
P → Q, P ⊨ Q,


and any deduction based on this form is semantically valid.


EXAMPLE 1.2.5 
Prove: P ∨ Q → Q, P ⊨ Q


P  Q | P ∨ Q   P ∨ Q → Q   (P ∨ Q → Q) ∧ P   (P ∨ Q → Q) ∧ P → Q
T  T |   T         T              T                   T
T  F |   T         F              F                   T
F  T |   T         T              F                   T
F  F |   F         T              F                   T


If it is possible for v(pᵢ) = T for i = 0, 1, ..., n − 1 yet v(q) = F, the propositional form

p₀ ∧ p₁ ∧ ··· ∧ pₙ₋₁ → q

is not a tautology, so q is not a consequence of p₀, p₁, ..., pₙ₋₁. If this is the case, write

p₀, p₁, ..., pₙ₋₁ ⊭ q.


EXAMPLE 1.2.6 
Prove: P ∧ Q → Q, P ⊭ Q


P  Q | P ∧ Q   P ∧ Q → Q   (P ∧ Q → Q) ∧ P   (P ∧ Q → Q) ∧ P → Q
T  T |   T         T              T                   T
T  F |   F         T              T                   F
F  T |   F         T              F                   T
F  F |   F         T              F                   T


Notice that F appears for (P ∧ Q → Q) ∧ P → Q on a line when
v(P ∧ Q → Q) = v(P) = T


yet v(Q) = F. Because of this, we can shorten the procedure for showing that a
propositional form is not a consequence of other propositional forms.


@ EXAMPLE 1.2.7 
Prove: P ∧ Q → R ⊭ P → R


P  Q  R | P ∧ Q → R   P → R
T  F  F |     T          F


Observe that this shows that the valuation of P ∧ Q → R can be T at the same
time that the valuation of P → R is F.
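The shortened procedure of Example 1.2.7 amounts to searching for a single valuation that makes every premise T and the conclusion F. Here is a minimal Python sketch of that search, assuming forms are given as Python functions of their variables; the names implies and counterexample are hypothetical.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Truth function of the conditional."""
    return (not a) or b


def counterexample(premises, conclusion, n):
    """Return a valuation with all premises T and the conclusion F, or None if there is none."""
    for v in product([True, False], repeat=n):
        if all(p(*v) for p in premises) and not conclusion(*v):
            return v
    return None


if __name__ == "__main__":
    # Example 1.2.4: P -> Q, P logically imply Q, so no counterexample exists.
    print(counterexample([lambda p, q: implies(p, q), lambda p, q: p],
                         lambda p, q: q, 2))
    # Example 1.2.7: P and Q -> R does not logically imply P -> R.
    print(counterexample([lambda p, q, r: implies(p and q, r)],
                         lambda p, q, r: implies(p, r), 3))
```

A result of None corresponds to ⊨ in Definition 1.2.2, while a returned valuation is a row witnessing ⊭.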


Syntactics 


Although we will return to semantics, it is important to note that using truth tables to 
check for logical implication has its limitations. If the argument form involves many 
propositional forms or if the propositional forms are complicated, the truth table used 
to show or disprove the logical implication can become unwieldy. Another issue is 
that in practice, truth tables are not the method of choice when determining whether 
a conclusion follows from the premises. What is typically done is to follow Euclid’s 
example (page 20), basing the conclusions on the syntax of the argument form, namely, 
based only on its pattern and structure. 

Let us return to the deduction on page 20 and work with it differently. Start with the 
two propositions, 


if squares are rectangles, then squares are quadrilaterals 
and 
squares are rectangles. 


Because of the combined structure of the two sentences, we know that we can write 




squares are quadrilaterals. 


This act of writing (on paper or a blackboard or in the mind) means that we have the 
proposition and that it follows from the first two. Similarly, if we start with 


squares are triangles, or squares are rectangles 
and 
squares are not triangles, 
we can write 
squares are rectangles. 


Determining a method that will model this reasoning requires us to find rules by 
which propositional forms can be written from other propositional forms. Since ev- 
ery logical system requires a starting point, the first step in this process is to choose 
which propositional forms can be written without any prior justification. Each such 
propositional form is called an axiom. Playing the same role as that of a postulate in 
Euclidean geometry, an axiom can be considered as a rule of the game. Certain propo- 
sitional forms lend themselves as good candidates for axioms because they are regarded 
as obvious. That is, they are self-evident. Other propositional forms are good candi- 
dates to be axioms, not because they are necessarily self-evident, but because they are 
helpful. In either case, the number of axioms should be as few as possible so as to min- 
imize the number of assumptions. For propositional logic, we choose only three. They 
were first found in work of Gottlob Frege (1879) and later in that of Jan Lukasiewicz 
(1930). 


H@ AXIOMS 1.2.8 [Frege—Lukasiewicz] 
Let p, q, and r be propositional forms.
• [FL1] p → (q → p)
• [FL2] p → (q → r) → (p → q → [p → r])
• [FL3] ¬p → ¬q → (q → p).


The next step in defining propositional logic is to state when it is legal to write a 
propositional form from given propositional forms. 


@ DEFINITION 1.2.9 


The propositional forms p₀, p₁, ..., pₙ₋₁ infer q if q can be written whenever
p₀, p₁, ..., pₙ₋₁ are written. Denote this by

p₀, p₁, ..., pₙ₋₁ ⇒ q.

This is known as an inference.

To make rigorous which propositional forms can be inferred from given forms, we
establish some rules. These are chosen because they model basic reasoning. They are
also not proved, so they serve as postulates for our logic.


@ INFERENCE RULES 1.2.10 


Let p, q, r, and s be propositional forms. 


• Modus Ponens [MP]
p → q, p ⇒ q


• Modus Tollens [MT]
p → q, ¬q ⇒ ¬p


• Constructive Dilemma [CD]
(p → q) ∧ (r → s), p ∨ r ⇒ q ∨ s


• Destructive Dilemma [DD]
(p → q) ∧ (r → s), ¬q ∨ ¬s ⇒ ¬p ∨ ¬r


• Disjunctive Syllogism [DS]
p ∨ q, ¬p ⇒ q


• Hypothetical Syllogism [HS]
p → q, q → r ⇒ p → r


• Conjunction [Conj]
p, q ⇒ p ∧ q


• Simplification [Simp]
p ∧ q ⇒ p


• Addition [Add]
p ⇒ p ∨ q.


To use Inference Rules 1.2.10, match the form exactly. For example, even though
P → R appears to follow from (P ∧ Q) → R as an application of simplification, it does
not. The problem is that simplification can only be applied to propositional forms with
the p ∧ q pattern, but (P ∧ Q) → R is of the form p → q. With this detail in mind, we
make some inferences.
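Exact matching can be made concrete by treating a propositional form as a tree and letting a rule inspect only its outermost connective. The sketch below is an illustration under assumptions made here: forms are nested Python tuples such as ("->", p, q), ("and", p, q), and ("not", p), variables are strings, and the function names are hypothetical.

```python
def modus_ponens(conditional, antecedent):
    """MP: from p -> q and p, return q; return None if the forms do not match exactly."""
    if (isinstance(conditional, tuple) and conditional[0] == "->"
            and conditional[1] == antecedent):
        return conditional[2]
    return None


def simplification(form):
    """Simp: from p and q, return p; return None if the outermost connective is not 'and'."""
    if isinstance(form, tuple) and form[0] == "and":
        return form[1]
    return None


if __name__ == "__main__":
    p_and_q = ("and", "P", "Q")
    # (P and Q) -> R has the p -> q pattern, so Simp does not apply to it.
    print(simplification(("->", p_and_q, "R")))
    # Simp applies to P and Q itself.
    print(simplification(p_and_q))
    # MP applied to P and Q -> not R together with P and Q.
    print(modus_ponens(("->", p_and_q, ("not", "R")), p_and_q))
```

Because simplification looks only at the root of the tree, the conjunction buried inside the antecedent of (P ∧ Q) → R is never touched, which is the point made in the text.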


WM EXAMPLE 1.2.11 
Each inference is justified by the indicated rule. 


• Modus ponens
P ∧ Q → ¬R, P ∧ Q ⇒ ¬R


• Addition
P ⇒ P ∨ Q ∧ R


• Modus tollens
¬¬P, Q ∨ R → ¬P ⇒ ¬(Q ∨ R).


WM EXAMPLE 1.2.12 


Since it is possible that some propositional forms are not needed for the inference, 
we also have the following: 


• Modus ponens
P ∧ Q → ¬R, P ∨ Q, P ∧ Q, Q ↔ S ⇒ ¬R


• Addition
P, R, S → T ⇒ P ∨ Q ∧ R


• Modus tollens
¬S, ¬¬P, P ∧ T, Q ∨ R → ¬P ⇒ ¬(Q ∨ R).


Inference is a powerful tool, but it can only be used to check simple deductions. 
Sometimes multiple inferences are needed to move from a collection of premises to a 
conclusion. For example, if we write 


p ∨ q, ¬p, q → r,

based on the first two propositional forms, we can write

q

by DS, and then based on this propositional form and the third of the given propositional
forms, we can write

r

by MP. This is a simple example of the next definition.


HM DEFINITION 1.2.13 


• A formal proof of the propositional form q (the conclusion) from the propositional
forms p₀, p₁, ..., pₙ₋₁ (the premises) is a sequence of propositional forms,

p₀, p₁, ..., pₙ₋₁, q₀, q₁, ..., qₘ₋₁,

such that qₘ₋₁ = q, and for all i = 0, 1, ..., m − 1, either qᵢ is an axiom,
if i = 0, then p₀, p₁, ..., pₙ₋₁ ⇒ qᵢ, or
if i > 0, then p₀, p₁, ..., pₙ₋₁, q₀, q₁, ..., qᵢ₋₁ ⇒ qᵢ.

If there exists a formal proof of q from p₀, p₁, ..., pₙ₋₁, then q is proved or
deduced from p₀, p₁, ..., pₙ₋₁ and we write

p₀, p₁, ..., pₙ₋₁ ⊢ q.

• If there are no premises, a formal proof of q is a sequence,

q₀, q₁, ..., qₘ₋₁,

such that q₀ is an axiom, qₘ₋₁ = q, and for all i > 0, either qᵢ is an axiom or

q₀, q₁, ..., qᵢ₋₁ ⇒ qᵢ.

In this case, write ⊢ q and call q a theorem.

Observe that any deduction with premises represented by p₀, p₁, ..., pₙ₋₁ and
conclusion by q is syntactically valid if p₀, p₁, ..., pₙ₋₁ ⊢ q.

We should note that although ⇒ and ⊢ have different meanings as syntactic symbols,
they are equivalent. If p ⇒ q, then p ⊢ q using the proof p, q. Conversely, suppose
p ⊢ q. This means that there exists a proof

p, q₀, q₁, ..., qₙ₋₁, q

so every time we write down p, we can also write down q. That is, p ⇒ q. We
summarize this as follows.


@ THEOREM 1.2.14
For all propositional forms p and q, p ⇒ q if and only if p ⊢ q.


We use a particular style to write formal proofs. They will be in two-column format 
with each line being numbered. In the first column will be the sequence of propositional 
forms that make up the proof. In the second column will be the reasons that allowed 
us to include each form. The only reasons that we will use are 


• Given (for premises),
• FL1, FL2, or FL3 (for an axiom),
• An inference rule.


An inference rule is cited by giving the line numbers used as the premises followed by
the abbreviation for the rule. Thus, the following proves P ∨ Q → Q ∧ R, P ⊢ Q:


1. P ∨ Q → Q ∧ R    Given
2. P                Given
3. P ∨ Q            2 Add
4. Q ∧ R            1, 3 MP
5. Q                4 Simp


Despite the style, we should remember that a proof is a sequence of propositional forms
that satisfy Definition 1.2.13. In this case, the sequence is


P ∨ Q → Q ∧ R, P, P ∨ Q, Q ∧ R, Q.
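Since a formal proof is a sequence in which every line is a premise or follows from earlier lines by a rule, checking one is mechanical. The following Python sketch checks the five-line proof above. It is an illustration only: it assumes forms are nested tuples such as ("->", p, q) with strings as variables, it supports just the reasons Given, Add, MP, and Simp, and the name check_proof is hypothetical.

```python
def check_proof(premises, lines):
    """Check a two-column proof given as (form, reason, cited line numbers) triples."""
    written = []  # forms accepted so far; line k is written[k - 1]
    for form, reason, cited in lines:
        # Cited lines must already have been written.
        if any(not 1 <= k <= len(written) for k in cited):
            return False
        refs = [written[k - 1] for k in cited]
        if reason == "Given":
            ok = form in premises
        elif reason == "MP":    # p -> q, p  =>  q
            ok = len(refs) == 2 and refs[0] == ("->", refs[1], form)
        elif reason == "Add":   # p  =>  p or q
            ok = (len(refs) == 1 and isinstance(form, tuple)
                  and form[0] == "or" and form[1] == refs[0])
        elif reason == "Simp":  # p and q  =>  p
            ok = (len(refs) == 1 and isinstance(refs[0], tuple)
                  and refs[0][0] == "and" and refs[0][1] == form)
        else:
            ok = False
        if not ok:
            return False
        written.append(form)
    return True


if __name__ == "__main__":
    given1 = ("->", ("or", "P", "Q"), ("and", "Q", "R"))
    proof = [
        (given1, "Given", []),
        ("P", "Given", []),
        (("or", "P", "Q"), "Add", [2]),
        (("and", "Q", "R"), "MP", [1, 3]),
        ("Q", "Simp", [4]),
    ]
    print(check_proof([given1, "P"], proof))
```

Changing the last line to ("R", "Simp", [4]) makes the check fail, because Simp as stated in Inference Rules 1.2.10 yields only the left conjunct.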


The first two examples involve proofs that use the axioms. 




EXAMPLE 1.2.15 
Prove: ⊢ P → Q → (P → P)


1. P → (Q → P)                            FL1
2. P → (Q → P) → (P → Q → [P → P])        FL2
3. P → Q → (P → P)                        1, 2 MP


This proves that P → Q → (P → P) is a theorem. Also, by adding P → Q as a
given and an application of MP at the end, we can prove


P → Q ⊢ P → P.


This result should not be surprising since P → P is a tautology. We would expect
any premise to be able to prove it.


M EXAMPLE 1.2.16 
Prove: -(Q > P),aP + 7=Q 


1. 7(Q > P) Given 
Ds Sue Given 
3. 3aP>~7AQO-(Q—-P) FL3 

4. 3P>-7Q 1,3 MT 
5. 71Q 2, 4 MP 


The next three examples do not use an axiom in their proofs. 


M@ EXAMPLE 1.2.17 


Prove: P → Q, Q → R, S ∨ ¬R, ¬S ⊢ ¬P


1. P → Q      Given
2. Q → R      Given
3. S ∨ ¬R     Given
4. ¬S         Given
5. P → R      1, 2 HS
6. ¬R         3, 4 DS
7. ¬P         5, 6 MT


M@ EXAMPLE 1.2.18 


Prove: P → Q, P → Q → (T → S), P ∨ T, ¬Q ⊢ S


1. P → Q                  Given
2. P → Q → (T → S)        Given
3. P ∨ T                  Given
4. ¬Q                     Given
5. T → S                  1, 2 MP
6. (P → Q) ∧ (T → S)      1, 5 Conj
7. Q ∨ S                  3, 6 CD
8. S                      4, 7 DS


M EXAMPLE 1.2.19 
Prove: P → Q, Q → R, ¬R ⊢ ¬Q ∨ ¬P


1. P → Q                  Given
2. Q → R                  Given
3. ¬R                     Given
4. (Q → R) ∧ (P → Q)      1, 2 Conj
5. ¬R ∨ ¬Q                3 Add
6. ¬Q ∨ ¬P                4, 5 DD


Exercises 


1. Show using truth tables.

(a) ¬P ∨ Q, ¬Q ⊨ ¬P
(b) ¬(P ∧ Q), P ⊨ ¬Q
(c) P → Q, P ⊨ Q ∨ R
(d) P → Q, Q → R, P ⊨ R
(e) P ∨ Q ∧ R, ¬P ⊨ R


2. Show the following using truth tables.

(a) ¬(P ∧ Q) ⊭ ¬P
(b) P → Q ∨ R, P ⊭ Q
(c) P ∧ Q → R ⊭ Q → R
(d) (P → Q) ∨ (R → S), P ∨ R ⊭ Q ∨ S
(e) ¬(P ∧ Q) ∨ R, P ∧ Q ∨ S ⊭ R ∧ S
(f) P ∨ R, Q ∨ S, R → S ⊭ R ∧ S


3. Identify the rule from Inference Rules 1.2.10. 


(a) 
(b) 


P-Q->P,P-Q>P 
P,OVR>PACQVR 
P>PVvV(Re7APA-[0 - S]) 
P,P-(QeS)20¢S8S 


PVOVO(PVOSOAQSSAT)ZSOVSAT 


PV(QVS),7AP>QVS 
P—--70,7-7-0 => =P 


(P > OVA(O > R), 7-OV7AR=>-7-PV-7O 


(P-QAQ>-R>P-Q 




4. Arrange each collection of propositional forms into a proof for the given deductions 
and supply the appropriate reasons. 

(a) P → Q, R → S, P ⊢ Q ∨ S
• P
• Q ∨ S
• R → S
• (P → Q) ∧ (R → S)
• P ∨ R
• P → Q

(b) P → Q, Q → R, P ⊢ R ∨ Q
• P
• P → R
• P → Q
• R
• Q → R
• R ∨ Q

(c) (P → Q) ∨ (Q → R), ¬(P → Q), ¬R, Q ∨ S ⊢ S
• ¬(P → Q)
• S
• (P → Q) ∨ (Q → R)
• Q → R
• ¬R
• ¬Q
• Q ∨ S

(d) (P ∨ Q) ∧ R, Q ∨ S → T, ¬P ⊢ ¬P ∧ T
• P ∨ Q
• Q
• Q ∨ S
• T
• (P ∨ Q) ∧ R
• Q ∨ S → T
• ¬P
• ¬P ∧ T


5. Prove using Axioms 1.2.8.
(a) ⊢ P → P
(b) ⊢ ¬¬P → P
(c) ⊢ P → (P → [Q → P])
(d) ¬P ⊢ P → Q
(e) P → Q, Q → R ⊢ P → R (Do not use HS.)
(f) P → Q, ¬Q ⊢ ¬P (Do not use MT.)


6. Prove. Axioms 1.2.8 are not required.
(a) P → Q, P ∨ (R → S), ¬Q ⊢ R → S
(b) P → Q, Q → R, ¬R ⊢ ¬P
(c) P → Q, R → S, ¬Q ∨ ¬S ⊢ ¬P ∨ ¬R
(d) [P → (Q → R)] ∧ [Q → (R → P)], P ∨ Q, ¬(Q → R), ¬P ⊢ ¬R
(e) P → Q → (R → S), S → T, P → Q, R ⊢ T
(f) P → Q ∧ R, Q ∨ S → T ∧ U, P ⊢ T
(g) P → Q ∧ R, ¬(Q ∧ R), Q ∧ R ∨ (¬P → S) ⊢ S
(h) P ∨ Q → ¬R ∧ ¬S, Q → R, P ⊢ ¬Q
(i) N → P, P ∨ Q ∨ R → S ∨ T, S ∨ T → T, N ⊢ T
(j) P → Q, Q → R, R → S, S → T, P ∨ R, ¬R ⊢ T
(k) P ∨ Q → R ∨ S, (R → T) ∧ (S → U), P, ¬T ⊢ U
(l) P → Q, Q → R, R → S, (P ∨ Q) ∧ (R ∨ S) ⊢ Q ∨ S
(m) P ∨ ¬Q ∨ R → (S → P), P ∨ ¬Q → (P → R), P ⊢ S → R

1.3 REPLACEMENT 


There are times when writing a formal proof that we want to substitute one propositional 
form for another. This happens when two propositional forms have the same valuations. 
It also happens when a particular sentence pattern should be able to replace another 
sentence pattern. We give rules in this section that codify both ideas. 


Semantics 


Consider the propositional form ¬(P ∨ Q). Its valuation equals T when it is not the
case that v(P) = T or v(Q) = T (Definition 1.1.9). This implies that v(P) = F and
v(Q) = F, so v(¬P ∧ ¬Q) = T. Conversely, if the valuation of ¬P ∧ ¬Q is T, then
the valuation of ¬(P ∨ Q) is T for similar reasons. Since no additional premises were
assumed in this discussion, we conclude that


v(¬[P ∨ Q]) = v(¬P ∧ ¬Q),    (1.3)


and this means that
¬(P ∨ Q) ↔ ¬P ∧ ¬Q


is a tautology. There is a name for this.

@ DEFINITION 1.3.1
Two propositional forms p and q are logically equivalent if ⊨ p ↔ q.


Observe that Definition 1.3.1 implies the following result. 


@ THEOREM 1.3.2 


All tautologies are logically equivalent, and all contradictions are logically equiv- 
alent. 




Because of (1.3), we can use a truth table to prove logical equivalence. 


@ EXAMPLE 1.3.3 
Prove: ⊨ ¬(P ∨ Q) ↔ ¬P ∧ ¬Q.


P  Q | P ∨ Q   ¬(P ∨ Q)   ¬P   ¬Q   ¬P ∧ ¬Q
T  T |   T        F        F    F      F
T  F |   T        F        F    T      F
F  T |   T        F        T    F      F
F  F |   F        T        T    T      T


M@ EXAMPLE 1.3.4 


Because ⊨ P → Q ↔ ¬P ∨ Q, we can replace P → Q with ¬P ∨ Q, and vice
versa, at any time. When this is done, the resulting propositional form is logically
equivalent to the original. For example,


⊨ Q ∧ (P → Q) ↔ Q ∧ (¬P ∨ Q).


To see this, examine the truth table


P  Q | P → Q   Q ∧ (P → Q)   ¬P   ¬P ∨ Q   Q ∧ (¬P ∨ Q)
T  T |   T          T         F      T           T
T  F |   F          F         F      F           F
F  T |   T          T         T      T           T
F  F |   T          F         T      T           F


@ EXAMPLE 1.3.5
Consider R ∧ (P → Q). Since ⊨ P → Q ↔ ¬Q → ¬P [Exercise 2(f)], we can
replace P → Q with ¬Q → ¬P giving
⊨ R ∧ (P → Q) ↔ R ∧ (¬Q → ¬P).
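Logical equivalence as in Definition 1.3.1 can be tested by comparing two forms under every valuation, which is the same as checking that their truth-table columns agree. The Python sketch below assumes forms are given as functions of their variables; the names implies and equivalent are hypothetical.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Truth function of the conditional."""
    return (not a) or b


def equivalent(f, g, n: int) -> bool:
    """True when f and g have the same valuation under all 2**n assignments."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=n))


if __name__ == "__main__":
    # Example 1.3.3: not (P or Q) and (not P) and (not Q).
    print(equivalent(lambda p, q: not (p or q),
                     lambda p, q: (not p) and (not q), 2))
    # Example 1.3.4: Q and (P -> Q) and Q and (not P or Q).
    print(equivalent(lambda p, q: q and implies(p, q),
                     lambda p, q: q and ((not p) or q), 2))
    # Example 1.3.5: R and (P -> Q) and R and (not Q -> not P).
    print(equivalent(lambda p, q, r: r and implies(p, q),
                     lambda p, q, r: r and implies(not q, not p), 3))
```

Comparing columns directly avoids building the biconditional, but it checks the same condition as ⊨ p ↔ q.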


When studying an implication, we sometimes need to investigate the different ways 
that its antecedent and consequent relate to each other. 


@ DEFINITION 1.3.6 
The converse of a given implication is the conditional proposition formed by 
exchanging the antecedent and consequent of the implication (Figure 1.2). 

HM DEFINITION 1.3.7 


The contrapositive of a given implication is the conditional proposition formed 
by exchanging the antecedent and consequent of the implication and then replac- 
ing them with their negations (Figure 1.3). 


If the antecedent, then the consequent.

If the consequent, then the antecedent.


Figure 1.2. Writing the converse.


If the antecedent, then the consequent.

If not the consequent, then not the antecedent.


Figure 1.3. Writing the contrapositive.


For example, the converse of 


if rectangles have four sides, squares have four sides


is


if squares have four sides, rectangles have four sides,


and its contrapositive is


if squares do not have four sides, rectangles do not have four sides.


Notice that a biconditional proposition is simply the conjunction of a conditional with
its converse.


@ EXAMPLE 1.3.8 


The propositional form P → Q has Q → P as its converse and ¬Q → ¬P as its
contrapositive. The first and fourth columns on the right of the next truth table
show ⊨ P → Q ↔ ¬Q → ¬P, while ⊭ P → Q ↔ Q → P is shown by the first
and last columns.


P  Q | P → Q   ¬Q   ¬P   ¬Q → ¬P   Q → P
T  T |   T      F    F       T        T
T  F |   F      T    F       F        T
F  T |   T      F    T       T        F
F  F |   T      T    T       T        T


Syntactics 


If we limit our proofs to Inference Rules 1.2.10, we quickly realize that there will be
little of interest that we can prove. We would have no reason on which to base such
clear inferences as

P ⊢ Q ∨ P


or
P ∨ Q, ¬Q ⊢ P.


To fix this, we expand our collection of inference rules with a new type.
Suppose that we know that the form p ∧ q can replace q ∧ p at any time and vice
versa. For example, in the propositional form


P ∧ Q → R,    (1.4)

P ∧ Q can be replaced with Q ∧ P so that we can write the new form

Q ∧ P → R.    (1.5)


This type of rule is called a replacement rule and is written using the ⇔ symbol. For
example, the replacement rule that allowed us to write (1.5) from (1.4) is


p ∧ q ⇔ q ∧ p.

Similarly, when the replacement rule

¬(p ∧ q) ⇔ ¬p ∨ ¬q


is applied to P ∨ (¬Q ∨ ¬R), the result is P ∨ ¬(Q ∧ R). We state without proof the
standard replacement rules.
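Unlike an inference rule, a replacement rule may act on a subform. The Python sketch below illustrates this with the two replacements just described. It assumes forms are nested tuples such as ("->", p, q) with strings as variables, and it locates a subform by a path of tuple indices (1 for the left operand, 2 for the right); these conventions and the function names are assumptions made here for illustration.

```python
def commute_and(form):
    """Com: rewrite p and q as q and p; None if the form is not a conjunction."""
    if isinstance(form, tuple) and form[0] == "and":
        return ("and", form[2], form[1])
    return None


def de_morgan_right_to_left(form):
    """DeM read right to left: rewrite (not p) or (not q) as not (p and q); None otherwise."""
    if (isinstance(form, tuple) and form[0] == "or"
            and isinstance(form[1], tuple) and form[1][0] == "not"
            and isinstance(form[2], tuple) and form[2][0] == "not"):
        return ("not", ("and", form[1][1], form[2][1]))
    return None


def replace_at(form, path, rule):
    """Apply rule to the subform reached by path; the path is assumed to be valid."""
    if not path:
        return rule(form)
    i = path[0]
    new_sub = replace_at(form[i], path[1:], rule)
    if new_sub is None:
        return None  # the rule did not match the chosen subform
    return form[:i] + (new_sub,) + form[i + 1:]


if __name__ == "__main__":
    # (1.4) to (1.5): replace P and Q by Q and P inside P and Q -> R.
    print(replace_at(("->", ("and", "P", "Q"), "R"), [1], commute_and))
    # P or (not Q or not R) becomes P or not (Q and R).
    start = ("or", "P", ("or", ("not", "Q"), ("not", "R")))
    print(replace_at(start, [2], de_morgan_right_to_left))
```

The rest of the form is copied unchanged, so only the selected subform is rewritten.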


@ REPLACEMENT RULES 1.3.9 


Let p, gq, and r be propositional forms. 


• Associative Laws [Assoc]
p ∧ q ∧ r ⇔ p ∧ (q ∧ r)
p ∨ q ∨ r ⇔ p ∨ (q ∨ r)


• Commutative Laws [Com]
p ∧ q ⇔ q ∧ p
p ∨ q ⇔ q ∨ p


• Distributive Laws [Distr]
p ∧ (q ∨ r) ⇔ p ∧ q ∨ p ∧ r
p ∨ q ∧ r ⇔ (p ∨ q) ∧ (p ∨ r)


• Contrapositive Law [Contra]
p → q ⇔ ¬q → ¬p


• Double Negation [DN]
p ⇔ ¬¬p


• De Morgan’s Laws [DeM]
¬(p ∧ q) ⇔ ¬p ∨ ¬q
¬(p ∨ q) ⇔ ¬p ∧ ¬q


• Idempotency [Idem]
p ∧ p ⇔ p
p ∨ p ⇔ p


• Material Equivalence [Equiv]
p ↔ q ⇔ (p → q) ∧ (q → p)
p ↔ q ⇔ p ∧ q ∨ ¬p ∧ ¬q


• Material Implication [Impl]
p → q ⇔ ¬p ∨ q


• Exportation [Exp]
p ∧ q → r ⇔ p → (q → r).


A replacement rule is used in a formal proof by appealing to the next inference rule. 


M@ INFERENCE RULE 1.3.10 


For all propositional forms p and q, if p is obtained from q using a replacement
rule, p ⇒ q and q ⇒ p.


As with Inference Rules 1.2.10, the replacement rule used when applying Inference
Rule 1.3.10 must be used exactly as stated. This includes times when it seems
unnecessary because it appears obvious. For example, P ∨ Q does not follow directly from
¬P → Q using Impl. Instead, include a use of DN to give the correct sequence,


¬P → Q, ¬¬P ∨ Q, P ∨ Q.


Similarly, the inference rule Add does not allow for Q ∨ P to be derived from P. Instead,
we derive P ∨ Q. Follow this by Com to conclude Q ∨ P.

When writing formal proofs that appeal to Inference Rule 1.3.10, do not cite that
particular rule but reference the replacement rule’s abbreviation using Replacement
Rules 1.3.9 and the line which serves as a premise to the replacement. We use this
practice in the following examples.


M@ EXAMPLE 1.3.11 


Although it is common practice to move parentheses freely when solving equations,
inferences such as


P ∧ Q ∧ (R ∧ S) ⊢ [P ∧ (Q ∧ R)] ∧ S


must be carefully demonstrated. Fortunately, this example only requires two
applications of the associative law. Using the boxes as a guide (drawn here as ⟦ ⟧), notice that


⟦P ∧ Q⟧ ∧ (⟦R⟧ ∧ ⟦S⟧)


is of the same form as the right-hand side of the associative law. Hence, we can
remove the parentheses to obtain


⟦P ∧ Q⟧ ∧ ⟦R⟧ ∧ ⟦S⟧.


Next, view the propositional form as


⟦P ∧ Q ∧ R⟧ ∧ ⟦S⟧.


One more application within the first box yields the result,


⟦P ∧ (Q ∧ R)⟧ ∧ ⟦S⟧.


We may, therefore, write a sequence of inferences by Inference Rule 1.3.10,
P ∧ Q ∧ (R ∧ S) ⇒ P ∧ Q ∧ R ∧ S ⇒ [P ∧ (Q ∧ R)] ∧ S,


and we have a proof of [P ∧ (Q ∧ R)] ∧ S from P ∧ Q ∧ (R ∧ S):


1. P ∧ Q ∧ (R ∧ S)    Given
2. P ∧ Q ∧ R ∧ S      1 Assoc
3. P ∧ (Q ∧ R) ∧ S    2 Assoc


M@ EXAMPLE 1.3.12 
Prove: R ∧ S ⊢ ¬R → P


1. R ∧ S       Given
2. R           1 Simp
3. R ∨ P       2 Add
4. ¬¬R ∨ P     3 DN
5. ¬R → P      4 Impl


@ EXAMPLE 1.3.13 
Prove: P → Q, R → Q ⊢ P ∨ R → Q


1. P → Q                  Given
2. R → Q                  Given
3. (P → Q) ∧ (R → Q)      1, 2 Conj
4. (¬P ∨ Q) ∧ (¬R ∨ Q)    3 Impl
5. (Q ∨ ¬P) ∧ (Q ∨ ¬R)    4 Com
6. Q ∨ ¬P ∧ ¬R            5 Dist
7. ¬P ∧ ¬R ∨ Q            6 Com
8. ¬(P ∨ R) ∨ Q           7 DeM
9. P ∨ R → Q              8 Impl


M@ EXAMPLE 1.3.14 


Prove: P ∧ Q ∨ R ∧ S ⊢ (P ∨ S) ∧ (Q ∨ R)


1. P ∧ Q ∨ R ∧ S                                  Given
2. (P ∧ Q ∨ R) ∧ (P ∧ Q ∨ S)                      1 Dist
3. (R ∨ P ∧ Q) ∧ (S ∨ P ∧ Q)                      2 Com
4. (R ∨ P) ∧ (R ∨ Q) ∧ [(S ∨ P) ∧ (S ∨ Q)]        3 Dist
5. (R ∨ Q) ∧ (R ∨ P) ∧ [(S ∨ P) ∧ (S ∨ Q)]        4 Com
6. (R ∨ Q) ∧ ([R ∨ P] ∧ [(S ∨ P) ∧ (S ∨ Q)])      5 Assoc
7. R ∨ Q                                          6 Simp
8. (S ∨ P) ∧ (S ∨ Q) ∧ [(R ∨ Q) ∧ (R ∨ P)]        5 Com
9. (S ∨ P) ∧ (S ∨ Q)                              8 Simp
10. S ∨ P                                         9 Simp
11. (S ∨ P) ∧ (R ∨ Q)                             7, 10 Conj
12. (P ∨ S) ∧ (Q ∨ R)                             11 Com


M@ EXAMPLE 1.3.15 


Both ⊢ P → P and ⊢ P ∨ ¬P. Here is the proof for the second theorem:


1. P → (P → P)         FL1
2. ¬P ∨ (¬P ∨ P)       1 Impl
3. (¬P ∨ ¬P) ∨ P       2 Assoc
4. ¬P ∨ P              3 Idem
5. P ∨ ¬P              4 Com


This proof can be generalized to any propositional form p, so that we also have


⊢ p → p and ⊢ p ∨ ¬p.


The theorem p ∨ ¬p is known as the law of the excluded middle, and the theorem
¬(p ∧ ¬p) is the law of noncontradiction. Notice that by De Morgan’s law with Com
and DN, we have for all propositional forms p


Exercises 


F (pV 7p) = *(pA7p). 


1. The inverse of a given implication is the contrapositive of the implication’s
converse. Write the converse, contrapositive, and inverse for each conditional proposition
in Exercise 1.1.2.


2. Prove using truth tables.

(a) ⊨ P ∨ ¬P ↔ P → P
(b) ⊨ P ∨ ¬P ↔ P ∨ Q ∨ ¬(P ∧ Q)
(c) ⊨ P ∧ Q ↔ (P ↔ Q) ∧ (P ∨ Q)
(d) ⊨ R ∧ (P → Q) ↔ R ∧ (¬Q → ¬P)
(e) ⊨ P ∧ Q → R ↔ P ∧ ¬R → ¬Q
(f) ⊨ P → Q ↔ ¬Q → ¬P
(g) ⊨ P → Q ∧ R ↔ (P → Q) ∧ (P → R)
(h) ⊨ P → Q → (S → R) ↔ (P → Q) ∧ S → R


3. A propositional form is in disjunctive normal form if it is a disjunction of con- 
junctions. For example, the propositional form P A (-Q V R) is logically equivalent 


to 


(PAQA R)V(PA7OAAR)V(APAQAZWR), 


which is in disjunctive normal form. Find propositional forms in disjunctive normal 
form that are logically equivalent to each of the following. 


(a) 
(b) 
(c) 
(d) 


PVQA(P VAR) 

(PV O)ACPV-7Q) 

(PA7AQV R)A(OA RV PARR) 
PV(7AOQV[PAARV PA7Q)) 


4. Identify the rule from Replacement Rule 1.3.9. 


(a) 
(b) 


(P-QVQ-RVSS(P- AVA RIV S) 
AAP SO OARSPSCOAR 
PVOVRSOVPVR 

aAPA7A(QV R)S&-7(PV [QV R)) 
aAPVOQ)VR@SePVO-R 
PSeQSREPAOQ-R)VAPA-A(O - R) 
PVOSQAQAQSPVOASOQ 
PAQARSRA(PAQ) 

(PVQ)A(QV R)S(PVOAAQV(PVO)AR 
(PA[R->Q])~ S@P-(R-OQ>-S) 


5. For each given propositional form p, find another propositional form q such that 
Pp = q using Replacement Rules 1.3.9. 


(a) 
(b) 


a3P 

PV@Q 

P-Q 

“(PA Q) 

(P< Q)ACP <Q) 
(P-Q)vVQ-S) 
(P>-Q)vP 
-(P > Q) 
(P-QAQeR) 




G) (PV7-Q)eTAQ 
(k) PVCQ+T)AQ 


6. Arrange each collection of propositional forms into a proof for the given deduction 
and supply the appropriate reasons. 

(a) PVOQ>REF(PSORAOQHR) 
e RVAPA-7AOQO 
e (APV R)ACOV R) 
e ~APVO)VR 
© (P>R)AQ->R) 
e (RV-AP)A(RV7Q) 
e aAPA7AQVR 
e PVO-R 

(b) ~PAQ)> RV S,7AP,3AS + R 
e APAQ)>RVS 
. AS 
° SVR 
° R 
* aP AQ) 
« aP 
° RVS 
e aPV-7AQ 

(c) P>(Q-> R),7P > S,70 —-T,R-ARFAT RO S 
© SVT 
°« ARVAR 
« R-oaAR 
e AR 
* aAP AQ) 
°° TVS 
- “TVS 
e aPV-7AQ 
e PAQ>R 
> P+(Q-R) 
e APS 
- 0 -T 
e «ATS 
© GP S)ACOQ>T) 


7. Prove. 
(a) "PF PQ 
(b) PF} 7AQ—>P 
(c) AOV(ARV-AP)F P->-7A(QA R) 
(d) PS>OFPAR-Q 




(y) 
(z) 


P>QARFP-Q 

PVQ>RFAR--@Q 
P>(Q>R)F}QA7AR—->7AP 
Ps-(Q>R)FQ>(P-R) 
PAQVRASE=AS > PAQ 
Q>RFP+-(Q-R) 
P->-(Q->R)FP-7AR 
PV(QVRVS)F(PVQ)V(RVS) 
QOVP>RASFQ>R 

PsoQARFP-Q 

PsoQVRFO-P 

PVOVR-SFO-S 
(PVQ)A(RVS)FPARVPASV(QARVQAS) 
PA(QVR) SP QARFP>-(Q- R) 
P°O,7"P' 70 

P>Q-->R,7ARF7=Q 
P>(Q>R),R-SATKFPO-(Q-T) 
P-(Q>R8),R-SVTE(P>-S)V(Q->T) 
P-~Q,P>RFP-OQAR 
PVQ>RAS,7P 3 (T ~ AT), ARE AT 

(P > Q)A(R-> S), PV R,(P — AS) A(R> AQF AeHAS 
PA(QAR),PAR>SV(TVM),AS AATF M 


1.4 PROOF METHODS 


The methods of Sections 1.2 and 1.3 provide a good start for writing formal proofs. 
However, in practice we rarely limit ourselves to these rules. We often use inference 
rules that give a straightforward way to prove conditional propositions and allow us to 
prove a proposition when it is easier to disprove its negation. In both cases, the new
inference rules will be justified using the rules we already know.


Deduction Theorem 


Because of Axioms 1.2.8, not all of the inference rules are needed to write the proofs 
found in Sections 1.2 and 1.3. This motivates the next definition. 


■ DEFINITION 1.4.1

Let p and q be propositional forms. The notation

p ⊢* q

means that there exists a formal proof of q from p using only Axioms 1.2.8, MP,
and Inference Rule 1.3.10, and the notation

⊢* q

means that there exists a formal proof of q from Axioms 1.2.8 using only MP and
Inference Rule 1.3.10.


For example, P, Q, ¬P ∨ (Q → R) ⊢* R because

1. P                    Given
2. Q                    Given
3. ¬P ∨ (Q → R)         Given
4. P → (Q → R)          3 Impl
5. Q → R                1, 4 MP
6. R                    2, 5 MP

is a formal proof using MP and Impl as the only inference rules.
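
The line-by-line nature of such a proof can be made concrete with a short program. The following Python sketch is an illustration only and is not part of the formal system: the tuple encoding of propositional forms and the names `impl_equiv` and `check_proof` are assumptions introduced here, and Impl is checked only when it is applied to a whole line, not to a subformula.

```python
# Propositional forms are encoded (an assumption of this sketch) as strings for
# variables and as tuples ("not", a), ("or", a, b), ("imp", a, b) otherwise.

def impl_equiv(disj, cond):
    """True when disj is (not a) or b and cond is a -> b for the same a and b."""
    return (disj[0] == "or" and cond[0] == "imp"
            and disj[1] == ("not", cond[1]) and disj[2] == cond[2])

def check_proof(premises, lines):
    """Check that each line is a Given, follows by MP, or follows by top-level Impl."""
    for n, (formula, reason, refs) in enumerate(lines, start=1):
        # A line may only cite lines that come before it.
        if any(r < 1 or r >= n for r in refs):
            return False
        cited = [lines[r - 1][0] for r in refs]
        if reason == "Given":
            ok = formula in premises
        elif reason == "MP" and len(cited) == 2:
            a, b = cited
            ok = b == ("imp", a, formula) or a == ("imp", b, formula)
        elif reason == "Impl" and len(cited) == 1:
            ok = impl_equiv(cited[0], formula) or impl_equiv(formula, cited[0])
        else:
            ok = False
        if not ok:
            return False
    return True

q_imp_r = ("imp", "Q", "R")
premises = ["P", "Q", ("or", ("not", "P"), q_imp_r)]
proof = [
    ("P", "Given", []),
    ("Q", "Given", []),
    (("or", ("not", "P"), q_imp_r), "Given", []),
    (("imp", "P", q_imp_r), "Impl", [3]),
    (q_imp_r, "MP", [1, 4]),
    ("R", "MP", [2, 5]),
]
print(check_proof(premises, proof))
```

The checker accepts the six-line proof above and rejects a variant whose last line does not follow by MP from the cited lines.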

We now observe that the propositional forms that can be proved using the full col- 
lection of rules from Sections 1.2 and 1.3 are exactly the propositional forms that can 
be proved when all rules are deleted from Inference Rules 1.2.10 except for MP. 


■ THEOREM 1.4.2

For all propositional forms p and q, p ⊢* q if and only if p ⊢ q.


PROOF
Trivially, p ⊢* q implies p ⊢ q, so suppose that p ⊢ q. We show that the remain-
ing parts of Inference Rules 1.2.10 are equivalent to using only Axioms 1.2.8,
MP, and Replacement Rules 1.3.9. We show three examples and leave the proofs
of the remaining inference rules to Exercise 6. The proof

1. p → q                          Given
2. ¬q                             Given
3. ¬¬p → ¬¬q                      1 DN
4. ¬¬p → ¬¬q → (¬q → ¬p)          FL3
5. ¬q → ¬p                        3, 4 MP
6. ¬p                             2, 5 MP

shows that

p → q, ¬q ⊢* ¬p.

Thus, we do not need MT. The proof

1. p                    Given
2. p → (¬q → p)         FL1
3. ¬q → p               1, 2 MP
4. ¬¬q ∨ p              3 Impl
5. q ∨ p                4 DN
6. p ∨ q                5 Com

shows that

p ⊢* p ∨ q,

so we do not need Add. This implies that we can use Add to demonstrate that

p → q, q → r ⊢* p → r.

The proof is as follows:

1. p → q                                  Given
2. q → r                                  Given
3. ¬q ∨ r                                 2 Impl
4. ¬q ∨ r ∨ ¬p                            3 Add
5. ¬p ∨ (¬q ∨ r)                          4 Com
6. p → (q → r)                            5 Impl
7. p → (q → r) → (p → q → [p → r])        FL2
8. p → q → (p → r)                        6, 7 MP
9. p → r                                  1, 8 MP

This implies that we do not need HS.


At this point, there is an obvious question: If only MP is needed from Inference 
Rules 1.2.10, why were the other rules included? The answer is that the other
inference rules are examples of common reasoning, and excluding them would introduce
unnecessary complications to the formal proofs. Try reproving some of the deductions 
of Section 1.2 with only MP and the axioms to confirm this. 

When a formal proof in Section 1.3 involved proving an implication, the replacement
rule Impl would often appear in the proof. However, as we know from geometry, this
is not the typical strategy used to prove an implication. What is usually done is that the 
antecedent is assumed and then the consequent is shown to follow. That this procedure 
justifies the given conditional is the next theorem. Its proof requires a lemma. 


■ LEMMA 1.4.3
Let p and q be propositional forms. If ⊢* q, then ⊢* p → q.

PROOF
Let ⊢* q. By FL1, we have that ⊢* q → (p → q), so ⊢* p → q follows by MP. ∎


■ THEOREM 1.4.4 [Deduction]

For all propositional forms p and q, p ⊢ q if and only if ⊢ p → q.


PROOF
Let p and q be propositional forms. Assume that ⊢ p → q, so there exist propo-
sitional forms r_0, r_1, ..., r_{n-1} such that

r_0, r_1, ..., r_{n-1}, p → q

is a proof, where r_0 is an axiom, r_i is an axiom or r_0, r_1, ..., r_{i-1} ⇒ r_i for i > 0,
and p → q follows from r_0, r_1, ..., r_{n-1}. Then,

p, r_0, r_1, ..., r_{n-1}, p → q, q

is also a proof, where the last inference is due to MP. Therefore, p ⊢ q.
By Theorem 1.4.2, to prove the converse, we only need to prove that

if p ⊢* q, then ⊢* p → q.

Assume p ⊢* q. First note that if q is an axiom, then ⊢* q, so ⊢* p → q by
Lemma 1.4.3. Therefore, assume that q is not an axiom. We begin by checking
four cases.


Suppose that the proof has only one propositional form. In this case, we
have that p = q, so the inference is of the form p ⊢* p. By FL1,

⊢* p → (p → p).

Because

p → (p → p) ⇔ ¬p ∨ (¬p ∨ p)
            ⇔ (¬p ∨ ¬p) ∨ p
            ⇔ ¬p ∨ p
            ⇔ p → p,

we conclude that ⊢* p → p.


Next, suppose the proof has two propositional forms and cannot be reduced
to the first case. This implies that p ⊢* q by a single application of a
replacement rule. Thus,

p → (¬q → p) ⊢* p → (¬q → q)

by a single application of the same replacement rule. Therefore,

p → (¬q → p) ⇒ p → (¬q → q)
             ⇔ p → (¬¬q ∨ q)
             ⇔ p → (q ∨ q)
             ⇔ p → q.

This implies that ⊢* p → q.


We now consider the case when the proof of p ⊢* q has three propositional
forms and q follows by a rule of replacement. Let p, r, q be the proof. This
implies that p ⊢* r, which implies

⊢* p → r                                            (1.6)

because either r is an axiom and Lemma 1.4.3 applies or the previous two
cases apply. If q follows from p by a rule of replacement, then ⊢* p → q by
the previous case, so assume that q follows from r by a rule of replacement.
Thus, ⊢* r → q, which implies by Lemma 1.4.3 that

⊢* p → (r → q).                                     (1.7)

By FL2,

⊢* p → (r → q) → (p → r → [p → q]).                 (1.8)

Therefore, by (1.7) and (1.8) with MP,

⊢* p → r → (p → q),                                 (1.9)

and by (1.6) and (1.9) with MP,

⊢* p → q.


Again, let the proof have three propositional forms and write it as p, s, q.
Suppose that the inference that leads to q is MP. This means that r and
r → q are in the proof. Because either p = r and p ⊢* r → q or p = r → q
and p ⊢* r,

⊢* p → r

and

⊢* p → (r → q).

Thus, as in the previous case, using (1.8), we obtain ⊢* p → q.


These four cases exhaust the ways by which q can be proved from p with a
proof with at most three propositional forms. Therefore, since these cases can
be generalized to proofs of arbitrary length (Exercise 7), we conclude that p ⊢ q
implies ⊢ p → q. ∎


The deduction theorem (1.4.4) yields the next result. Its proof is left to Exercise 8.

■ COROLLARY 1.4.5

For all propositional forms p_0, p_1, ..., p_{n-1}, q, r,

p_0, p_1, ..., p_{n-1}, q ⊢ r

if and only if

p_0, p_1, ..., p_{n-1} ⊢ q → r.


Direct Proof 
Most propositions that mathematicians prove are implications. For example, 
if a function is differentiable at a point, it is continuous at that same point. 


As we know, this means that whenever the function f is differentiable at x = a, it
must also be the case that f is continuous at x = a. Proofs of conditionals like this
are typically very difficult if we are only allowed to use Inference Rules 1.2.10 and Re-
placement Rules 1.3.9. Fortunately, in practice another inference rule is used. To prove
the differentiability result, what is usually done is that f is assumed to be differentiable
at x = a and then a series of steps that lead to the conclusion that f is continuous at
x = a are followed. We copy this strategy in our formal proofs using the next rule.
Sometimes known as conditional proof, this inference rule follows by Corollary 1.4.5
and Theorem 1.2.14.


■ INFERENCE RULE 1.4.6 [Direct Proof (DP)]

For propositional forms p_0, p_1, ..., p_{n-1}, q, r,

if p_0, p_1, ..., p_{n-1}, q ⊢ r, then p_0, p_1, ..., p_{n-1} ⇒ q → r.


PROOF
Suppose p_0, p_1, ..., p_{n-1}, q ⊢ r. Then, by Corollary 1.4.5,

p_0, p_1, ..., p_{n-1} ⊢ q → r.

Therefore, by Simp and Com,

p_0 ∧ p_1 ∧ ··· ∧ p_{n-1} ⊢ q → r,

so by Theorem 1.2.14,

p_0 ∧ p_1 ∧ ··· ∧ p_{n-1} ⇒ q → r.

Finally, we have by Conj and Theorem 1.2.14 that

p_0, p_1, ..., p_{n-1} ⇒ q → r. ∎

To see how this works, let us use direct proof to prove

P ∨ Q → (R ∧ S) ⊢ P → R.

To do this, we first prove

P ∨ Q → (R ∧ S), P ⊢ R.

Here is the proof:

1. P ∨ Q → (R ∧ S)      Given
2. P                    Given
3. P ∨ Q                2 Add
4. R ∧ S                1, 3 MP
5. R                    4 Simp


Therefore, by Inference Rule 1.4.6,

P ∨ Q → (R ∧ S) ⇒ P → R.

A proof of the original deduction can now be written as

1. P ∨ Q → (R ∧ S)      Given
2. P → R                1 DP

However, instead of writing the first proof off to the side, it is typically incorporated
into the proof as follows:

1. P ∨ Q → (R ∧ S)      Given
2. | P                  Assumption
3. | P ∨ Q              2 Add
4. | R ∧ S              1, 3 MP
5. | R                  4 Simp
6. P → R                2-5 DP


The proof of R from P is a subproof of the main proof. To separate the proposi-
tional forms of the subproof from the rest of the proof, they are indented with a vertical
line. The line begins with the assumption of P in line 2 as an additional premise.
Hence, its reason is Assumption. This assumption can only be used in the subproof.
Consider it a local hypothesis. It is only used to prove P → R. If we were allowed to
use it in other places of the proof, we would be proving a theorem that had different
premises than those that were given. Similarly, no line within the subproof can be
referenced from outside of it. We use the indentation to isolate the assumption and the
propositional forms that follow from it. When we arrive at R, we know that we have
proved P → R. The next line is this propositional form. It is entered into the proof
with the reason DP. The lines that are referenced are the lines of the subproof.
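
Direct proof has a semantic counterpart that is easy to test by brute force: a conclusion R holds under every valuation satisfying the premises together with P exactly when P → R holds under every valuation satisfying the premises alone. The Python sketch below checks this for the deduction just proved. It works with truth values rather than formal proofs, and the helper names `entails` and `imp` are assumptions of this illustration, not notation from the text.

```python
from itertools import product

def entails(variables, premises, conclusion):
    """True when every valuation that satisfies all premises satisfies the conclusion."""
    for values in product([True, False], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

def imp(a, b):
    """Truth value of the conditional a -> b."""
    return (not a) or b

# The premise P v Q -> (R ^ S) of the deduction above.
premise = lambda v: imp(v["P"] or v["Q"], v["R"] and v["S"])

# With P as an extra premise (the Assumption), R follows ...
with_assumption = entails("PQRS", [premise, lambda v: v["P"]], lambda v: v["R"])
# ... and with the assumption discharged, P -> R follows from the premise alone.
discharged = entails("PQRS", [premise], lambda v: imp(v["P"], v["R"]))
print(with_assumption, discharged)
```

Both checks succeed, which mirrors the way lines 2-5 of the subproof are traded for the single conditional on line 6.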


■ EXAMPLE 1.4.7
Prove: P → ¬Q, ¬R ∨ S ⊢ R ∨ Q → (P → S)

 1. P → ¬Q                  Given
 2. ¬R ∨ S                  Given
 3. | R ∨ Q                 Assumption
 4. | | P                   Assumption
 5. | | ¬Q                  1, 4 MP
 6. | | Q ∨ R               3 Com
 7. | | R                   5, 6 DS
 8. | | ¬¬R                 7 DN
 9. | | S                   2, 8 DS
10. | P → S                 4-9 DP
11. (R ∨ Q) → (P → S)       3-10 DP


■ EXAMPLE 1.4.8
Prove: P ∧ Q → R → S, ¬Q ∨ R ⊢ S

1. P ∧ Q → R → S        Given
2. ¬Q ∨ R               Given
3. | P ∧ Q              Assumption
4. | Q ∧ P              3 Com
5. | Q                  4 Simp
6. | ¬¬Q                5 DN
7. | R                  2, 6 DS
8. P ∧ Q → R            3-7 DP
9. S                    1, 8 MP

■ EXAMPLE 1.4.9
Prove: ⊢ P ∨ ¬P

1. | P                  Assumption
2. P → P                1 DP
3. ¬P ∨ P               2 Impl
4. P ∨ ¬P               3 Com

Note that lines 1-2 prove ⊢ P → P.


Indirect Proof 


When direct proof is either too difficult or not appropriate, there is another common 
approach to writing formal proofs. Sometimes going by the name of proof by contra- 
diction or reductio ad absurdum, this inference rule can also be used to prove propo- 
sitional forms that are not implications. 


■ INFERENCE RULE 1.4.10 [Indirect Proof (IP)]

For all propositional forms p and q,

¬q → (p ∧ ¬p) ⇒ q.


PROOF
Notice that instead of repeating the argument from Example 1.4.9 in this proof,
the example is simply cited as the reason on line 2.

1. ¬q → (p ∧ ¬p)        Given
2. p → p                Example 1.4.9
3. ¬(p ∧ ¬p) → ¬¬q      1 Contra
4. ¬(p ∧ ¬p) → q        3 DN
5. ¬p ∨ ¬¬p → q         4 DeM
6. ¬p ∨ p → q           5 DN
7. p → p → q            6 Impl
8. q                    2, 7 MP

The rule follows from Theorem 1.2.14. ∎


To use indirect proof, assume each premise and assume the negation of the con-
clusion. Then, proceed with the proof until a contradiction is reached. (In Inference
Rule 1.4.10, the contradiction is represented by p ∧ ¬p.) At this point, deduce the
original conclusion.


■ EXAMPLE 1.4.11
Prove: P ∨ Q → R, R ∨ S → ¬P ∧ T ⊢ ¬P.

 1. P ∨ Q → R               Given
 2. R ∨ S → ¬P ∧ T          Given
 3. | ¬¬P                   Assumption
 4. | P                     3 DN
 5. | P ∨ Q                 4 Add
 6. | R                     1, 5 MP
 7. | R ∨ S                 6 Add
 8. | ¬P ∧ T                2, 7 MP
 9. | ¬P                    8 Simp
10. | P ∧ ¬P                4, 9 Conj
11. ¬P                      3-10 IP

Since IP involves proving an implication, the formal proof takes the same form
as a proof involving DP.


Indirect proof can also be nested within another indirect subproof. As with direct proof, 
we cannot appeal to lines within a subproof from outside of it. 


■ EXAMPLE 1.4.12
Prove: P → Q ∧ R, Q → S, ¬P → S ⊢ S.

 1. P → Q ∧ R           Given
 2. Q → S               Given
 3. ¬P → S              Given
 4. | ¬S                Assumption
 5. | ¬Q                2, 4 MT
 6. | | P               Assumption
 7. | | Q ∧ R           1, 6 MP
 8. | | Q               7 Simp
 9. | | Q ∧ ¬Q          5, 8 Conj
10. | ¬P                6-9 IP
11. | S                 3, 10 MP
12. | S ∧ ¬S            4, 11 Conj
13. S                   4-12 IP

Notice that line 11 was not the end of the proof since it was within the first sub-
proof. It followed under the added hypothesis of ¬S.


■ EXAMPLE 1.4.13
Prove: P → R ⊢ P ∧ Q → R ∨ S.

1. P → R                Given
2. | P ∧ Q              Assumption
3. | | ¬R               Assumption
4. | | ¬P               1, 3 MT
5. | | P                2 Simp
6. | | P ∧ ¬P           4, 5 Conj
7. | R                  3-6 IP
8. | R ∨ S              7 Add
9. P ∧ Q → R ∨ S        2-8 DP


Exercises 


1. Find all mistakes in the given proofs. 
(a) “PVQ>7R,R>~-Q>SVQES” 


Attempted Proof 
1. PVQ->AR Given 
2. R-~7AQ>SVQ_ Given 
3. Assumption 
4. Assumption 
3 1,4 MT 
6. 5 DeM 
7. 6 Simp 
8 R--7Q 3-7 DP 
9. SVO 2,8 MP 
10. OvsS 9 Com 
ll. S 7,10 DS 
(b) ““APVQFP>Q- R” 
Attempted Proof 
1. xAPVO Given 
2. PP Assumption 




3. a4P 2 DN 
4. Q 1,3 DS 
5. PO 2-4 DP 
6 R MP 

7. P>~Q->R 2-6DP 


(c) “ARAS,APVOQ > RE-APVO-O” 


Attempted Proof 
1. 7ARAS Given 
2. Given 
3. Assumption 
4. Assumption 
oF 2,3 MP 
6. 1 Simp 
7. 5, 6 Conj 
8. 4-7 IP 
9. 8 DN 
10. 4,9DS 
11. aPvQ->Q 3-10DP 


2. Prove using direct proof. 
(a) P>QARFP-@Q 
(b) PVO>REFPAR 
(c) PV(QOVR)~SFO-S 
(d) P~O,R-OQFPVR-O 
(ec) P~Q,P->REFPOQAR 
() P>(Q> R)FQA7R>-7P 
(g) RoASEFPAQ>(R-3WS) 
(h) P>(Q> R)FO->(P- R) 
Gi) Po(Q->R),QO07-(R-S)FPS-(Q-S) 
g) P-~(Q>R),R-SATEFPO>(Q->T) 
(k) PO QARFPQ 
qd) PAQ@VR>QARIFP->(Q- R) 


3. Prove using indirect proof. 
(a) P>O,QO7>R,7ARF7AP 
(b) PVOAR,P>S,Q>StES 
(c) PVOQAAR,P>S,Q->REFS 
(4d) PSO,7P+ 70 
(ec) PVQVR>QAREAPVQAR 
(ff) P>Q,Q>R,S3T,PVR,AREKT 
(g) P>77AQ,R--7S,T -~QO,U ~ S,PVRE-ATV-U 




(h) PO QOAR,OVS>TAU,PET 


4. Prove using both direct and indirect proof. 
(a) P>3O,PV(R-S),7OF ROS 
(b) P3>-7(Q0 >7R)F POR 
(c) POP QFPAR-Q 
(I) PSOVRFO-P 


5. Prove by using direct proof to prove the contrapositive. 
(a) P~QO,R-S,S->T,7AQt AT > -7-(P Vv R) 
(b) PAQVRASEAS —> PAQ 
(c) PVQ>7R,S>REFPOAS 
(d) 7-P > 70, (ARV S)A(RVQ)E AS > P 


6. Prove to complete the proof of Theorem 1.4.2.
(a) p ∨ q, ¬p ⊢* q
(b) p, q ⊢* p ∧ q
(c) (p → q) ∧ (r → s), p ∨ r ⊢* q ∨ s
(d) (p → q) ∧ (r → s), ¬q ∨ ¬s ⊢* ¬p ∨ ¬r
(e) p ∧ q ⊢* p


7. Given there is a proof of q from p with four propositional forms, prove F p > q. 
Generalize the proof for n propositional forms. 


8. Prove Corollary 1.4.5. 


9. Can MP be replaced with another inference rule in Definition 1.4.1 and still have 
Theorem 1.4.2 hold true? If so, find the inference rules. 


10. Can any of the replacement rules be removed from Definition 1.4.1 and still have 
Theorem 1.4.2 hold true? If so, how many can be removed and which ones? 


1.5 THE THREE PROPERTIES 


We finish our introduction to propositional logic by showing that this logical system has 
three important properties. These are properties that are shared with Euclid’s geometry, 
but they are not common to all logical systems. 


Consistency 


Since we may need to consider infinitely many propositional forms, we now write our
lists of propositional forms as

p_0, p_1, p_2, ...,

allowing this sequence to be finite or infinite. Since proofs are finite, the notation

p_0, p_1, p_2, ... ⊢ q

means that there exists a subsequence i_0, i_1, ..., i_{n-1} of 0, 1, 2, ... such that

p_{i_0}, p_{i_1}, ..., p_{i_{n-1}} ⊢ q.

The notation

p_0, p_1, p_2, ... ⊬ q

means that no such subsequence exists.


■ DEFINITION 1.5.1

• The propositional forms p_0, p_1, p_2, ... are consistent if for every propositional
  form q,

  p_0, p_1, p_2, ... ⊬ q ∧ ¬q,

  and we write Con(p_0, p_1, p_2, ...). Otherwise, p_0, p_1, p_2, ... is inconsistent.

• A logical system is consistent if no contradiction is a theorem.


We have two goals. The first is to show that propositional logic is consistent. The 
second is to discover properties of sequences of consistent propositional forms that will 
aid in proving other properties of propositional logic. The next theorem is important 
to meet both of these goals. The equivalence of the first two parts is known as the
compactness theorem.


■ THEOREM 1.5.2

If p_0, p_1, p_2, ... are propositional forms, the following are equivalent in proposi-
tional logic.

• Con(p_0, p_1, p_2, ...).

• Every finite subsequence of p_0, p_1, p_2, ... is consistent.

• There exists a propositional form p such that p_0, p_1, p_2, ... ⊬ p.


PROOF
We have three implications to prove.

• Suppose there is a finite subsequence p_{i_0}, p_{i_1}, ..., p_{i_{n-1}} that proves q ∧ ¬q
  for some propositional form q. This implies that there is a formal proof of
  q ∧ ¬q from p_0, p_1, p_2, ...; therefore, not Con(p_0, p_1, p_2, ...).

• Assume that p_0, p_1, p_2, ... proves every propositional form. In particular,
  if we take a propositional form q, we have that p_0, p_1, p_2, ... ⊢ q ∧ ¬q. This
  means that there is a finite subsequence p_{i_0}, p_{i_1}, ..., p_{i_{n-1}} that proves q ∧ ¬q.

• Lastly, assume that there exists a propositional form q such that

  p_0, p_1, p_2, ... ⊢ q ∧ ¬q.

  This means that there exist subscripts i_0, i_1, ..., i_{n-1} and propositional forms
  r_0, r_1, ..., r_{m-1} such that

  p_{i_0}, p_{i_1}, ..., p_{i_{n-1}}, r_0, r_1, ..., r_{m-1}, q ∧ ¬q

  is a proof. Take any propositional form p. Then,

  p_{i_0}, p_{i_1}, ..., p_{i_{n-1}}, r_0, r_1, ..., r_{m-1}, q ∧ ¬q, p

  is a proof of p by IP. Therefore, p_0, p_1, p_2, ... ⊢ p. ∎


A sequence of propositional forms, such as P > Q, P, Q, although consistent, has 
the property that there are propositional forms that can be added to the sequence so that 
the resulting list remains consistent. When the sequence can no longer take new forms 
and remain consistent, we have arrived at a sequence that satisfies the next definition. 


■ DEFINITION 1.5.3

A sequence of propositional forms p_0, p_1, p_2, ... is maximally consistent when-
ever Con(p_0, p_1, p_2, ...) and for all propositional forms p, Con(p, p_0, p_1, p_2, ...)
implies that p = p_i for some i.


It is a convenient result of propositional logic that every consistent sequence of 
propositional forms can be extended to a maximally consistent sequence. This is pos- 
sible because all possible propositional forms can be put into a list. Following Defini- 
tion 1.1.2, we first list the propositional variables: 


A, B, C, ..., X, Y, Z,
A_0, A_1, A_2, ...
B_0, B_1, B_2, ...
⋮
Z_0, Z_1, Z_2, ...


Then, we list all propositional forms with only one propositional variable: 


¬A, ¬B, ¬C, ..., ¬X, ¬Y, ¬Z,
¬A_0, ¬A_1, ¬A_2, ...
¬B_0, ¬B_1, ¬B_2, ...
⋮
¬Z_0, ¬Z_1, ¬Z_2, ...


Next, we list all propositional forms with exactly two propositional variables starting 
by writing A on the right: 


A ∨ A, B ∨ A, C ∨ A, ..., X ∨ A, Y ∨ A, Z ∨ A,
A_0 ∨ A, A_1 ∨ A, A_2 ∨ A, ...
B_0 ∨ A, B_1 ∨ A, B_2 ∨ A, ...
⋮
Z_0 ∨ A, Z_1 ∨ A, Z_2 ∨ A, ...

A ∧ A, B ∧ A, C ∧ A, ..., X ∧ A, Y ∧ A, Z ∧ A,
A_0 ∧ A, A_1 ∧ A, A_2 ∧ A, ...
B_0 ∧ A, B_1 ∧ A, B_2 ∧ A, ...
⋮
Z_0 ∧ A, Z_1 ∧ A, Z_2 ∧ A, ...

A → A, B → A, C → A, ..., X → A, Y → A, Z → A,
A_0 → A, A_1 → A, A_2 → A, ...
B_0 → A, B_1 → A, B_2 → A, ...
⋮
Z_0 → A, Z_1 → A, Z_2 → A, ...

A ↔ A, B ↔ A, C ↔ A, ..., X ↔ A, Y ↔ A, Z ↔ A,
A_0 ↔ A, A_1 ↔ A, A_2 ↔ A, ...
B_0 ↔ A, B_1 ↔ A, B_2 ↔ A, ...
⋮
Z_0 ↔ A, Z_1 ↔ A, Z_2 ↔ A, ...


After following the same pattern by attaching A on the left, we continue by writing
¬A on the right, and then on the left, and then we adjoin B and ¬B, etc., and then
use three propositional variables, and then four, etc. Following a careful path through this
infinite list, we arrive at a sequence

q_0, q_1, q_2, ...

of all propositional forms. We are now ready for the theorem.


@ THEOREM 1.5.4 


A consistent sequence of propositional forms is a subsequence of a maximally 
consistent sequence of propositional forms. 


PROOF
Let p_0, p_1, p_2, ... be consistent and q_0, q_1, q_2, ... be a sequence of all propositional
forms. Define the sequence r_i as follows:

• Let r_0 = p_0 and

  r_1 = q_0   if Con(q_0, p_0, p_1, p_2, ...),
        p_0   otherwise,

  so the sequence at this stage is p_0, q_0 or p_0, p_0. Both of these are consistent.

• Let r_2 = p_1 and

  r_3 = q_1   if Con(q_1, r_0, r_1, r_2, p_0, p_1, p_2, ...),
        p_1   otherwise.

  At this stage, the sequence is still consistent and of the form r_0, r_1, p_1, q_1
  or r_0, r_1, p_1, p_1. The first sequence is consistent by the definition of r_3, and
  the second sequence is consistent because Con(r_0, r_1, p_1).

• Generalizing, let r_{2k} = p_k and

  r_{2k+1} = q_k   if Con(q_k, r_0, r_1, ..., r_{2k}, p_0, p_1, p_2, ...),
             p_k   otherwise,

  resulting in a consistent sequence of the form

  r_0, r_1, r_2, r_3, ..., p_k, q_k

  or

  r_0, r_1, r_2, r_3, ..., p_k, p_k.

Since p_0, p_1, p_2, ... is a subsequence of r_0, r_1, r_2, ..., it only remains to show that
the new sequence is maximally consistent.

• Let s be a propositional form such that r_0, r_1, r_2, ... ⊢ s ∧ ¬s. This implies
  that there exists a sequence i_j such that i_0 < i_1 < ··· < i_k and

  r_{i_0}, r_{i_1}, ..., r_{i_k} ⊢ s ∧ ¬s,

  but by Theorem 1.5.2, this is impossible because Con(r_0, r_1, ..., r_{i_k}).

• Suppose that s is a propositional form so that Con(s, r_0, r_1, r_2, ...). Write
  s = q_i for some i. Therefore,

  Con(q_i, r_0, r_1, ..., r_{2i}, p_0, p_1, p_2, ...),

  which means that s is a term of the sequence r_i because q_i was added at
  step 2i + 1. ∎
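
The extension step in this proof can be imitated on a small scale. The Python sketch below is only an illustration under simplifying assumptions: it works with a finite list of candidate forms rather than the infinite sequence q_0, q_1, q_2, ..., it adds candidates directly instead of interleaving them with the p_k, and it replaces the syntactic test Con by a brute-force search for a satisfying valuation (the two agree for propositional logic by the soundness and completeness theorems proved later in this section). The names `satisfiable` and `extend` are hypothetical.

```python
from itertools import product

def satisfiable(variables, forms):
    """True when some valuation makes every form in the list true."""
    return any(
        all(f(dict(zip(variables, values))) for f in forms)
        for values in product([True, False], repeat=len(variables))
    )

def extend(variables, start, candidates):
    """Walk through the candidates, keeping each one that preserves satisfiability."""
    result = list(start)
    for name, form in candidates:
        if satisfiable(variables, [f for _, f in result] + [form]):
            result.append((name, form))
    return result

# The consistent sequence P -> Q, P, Q from the text.
start = [
    ("P -> Q", lambda v: (not v["P"]) or v["Q"]),
    ("P", lambda v: v["P"]),
    ("Q", lambda v: v["Q"]),
]
# A short, arbitrary list of candidates standing in for q_0, q_1, q_2, ...
candidates = [
    ("R", lambda v: v["R"]),
    ("not R", lambda v: not v["R"]),
    ("not Q", lambda v: not v["Q"]),
    ("P and Q", lambda v: v["P"] and v["Q"]),
]
extended = extend("PQR", start, candidates)
print([name for name, _ in extended])
```

Once R has been accepted, its negation is rejected, just as at most one of q and ¬q can appear in a consistent sequence.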


Soundness 


We have defined two separate tracks in propositional logic. One track is used to assign 
T or F to a propositional form, and thus it can be used to determine the truth value of 
a proposition. The other track focused on developing methods by which one proposi- 
tional form can be proved from other propositional forms. These methods are used to 
write proofs in various fields of mathematics. The question arises whether these two 
tracks have been defined in such a way that they get along with each other. In other 
words, we want the propositional forms that we prove always to be assigned T, and 
we want the propositional forms that we always assign T to be provable. This means 
that we want semantic methods to yield syntactic results and syntactic methods to yield 
semantic results. 


Figure 1.4 Sound and complete logics. (Diagram: "The statement form is a theorem"
and "The statement form is a tautology," linked by arrows labeled Sound and Complete.)


■ DEFINITION 1.5.5

• A logic is sound if every theorem is a tautology.

• A logic is complete if every tautology is a theorem.


There is no guarantee that the construction of the two tracks for a logic will have these 
two properties (Figure 1.4), but it does in the case of propositional logic. 

To prove that propositional logic is sound, we need three lemmas. The proof of the 
first is left to Exercise 1. 


@ LEMMA 1.5.6 


The propositional forms of Axioms 1.2.8 are tautologies. 


■ LEMMA 1.5.7

Let p, q, and r be propositional forms.

• If p ⇒ r, then p → r is a tautology.

• If p, q ⇒ r, then p ∧ q → r is a tautology.


PROOF
This is simply a matter of checking Inference Rules 1.2.10 and 1.3.10.
For example, to check the theorem for De Morgan's law, we must show that

¬(p ∧ q) → ¬p ∨ ¬q

and

¬p ∨ ¬q → ¬(p ∧ q)

are tautologies. To do this, examine the truth table

p  q | p ∧ q  ¬(p ∧ q)  ¬p  ¬q  ¬p ∨ ¬q | ¬(p ∧ q) → ¬p ∨ ¬q   ¬p ∨ ¬q → ¬(p ∧ q)
T  T |   T       F      F   F      F    |          T                    T
T  F |   F       T      F   T      T    |          T                    T
F  T |   F       T      T   F      T    |          T                    T
F  F |   F       T      T   T      T    |          T                    T

We also have to show that

¬(p ∨ q) → ¬p ∧ ¬q

and

¬p ∧ ¬q → ¬(p ∨ q)

are tautologies.
As another example, to check that the disjunctive syllogism leads to an impli-
cation that is a tautology, examine the truth table:

p  q | p ∨ q  ¬p  (p ∨ q) ∧ ¬p | (p ∨ q) ∧ ¬p → q
T  T |   T    F        F       |        T
T  F |   T    F        F       |        T
F  T |   T    T        T       |        T
F  F |   F    T        F       |        T

The other rules are checked similarly (Exercise 2). ∎
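
Truth tables of this kind are mechanical to produce. As a sketch, the Python code below regenerates the table for the disjunctive syllogism and confirms that its last column is constantly T. The function name `truth_table` and the representation of each column as a function of a valuation are assumptions made for this illustration.

```python
from itertools import product

def truth_table(variables, columns):
    """Return one row per valuation: the variable values, then each column's value."""
    rows = []
    for values in product([True, False], repeat=len(variables)):
        v = dict(zip(variables, values))
        rows.append(list(values) + [column(v) for column in columns])
    return rows

def imp(a, b):
    """Truth value of the conditional a -> b."""
    return (not a) or b

# Columns: p v q, not p, (p v q) ^ not p, and (p v q) ^ not p -> q.
ds_columns = [
    lambda v: v["p"] or v["q"],
    lambda v: not v["p"],
    lambda v: (v["p"] or v["q"]) and not v["p"],
    lambda v: imp((v["p"] or v["q"]) and not v["p"], v["q"]),
]
ds_table = truth_table("pq", ds_columns)
ds_is_tautology = all(row[-1] for row in ds_table)

for row in ds_table:
    print(" ".join("T" if value else "F" for value in row))
```

The same function can be reused for the remaining rules of Exercise 2 by swapping in other columns.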


@ LEMMA 1.5.8 


If p > q and p are tautologies, then q is a tautology. 


PROOF
This is done by examining the truth table of p → q (page 12) where we see that
v(q) is constant and equal to T because v(p) and v(p → q) are constant and both
equal to T. ∎


We now prove the first important property of propositional logic. 


■ THEOREM 1.5.9 [Soundness]

Every theorem of propositional logic is a tautology.


PROOF
Let p be a theorem. Let p_0, p_1, ..., p_{n-1} be propositional forms such that

p_0, p_1, ..., p_{n-1}

is a proof for p such that p_0 is an axiom, p_i with i > 0 is an axiom or follows
by a rule of inference, and p_{n-1} = p (Definition 1.2.13). We now examine the
propositional forms of the proof.

• By Lemma 1.5.6, p_0 is a tautology.

• The propositional form p_1 is a tautology for one of two reasons. If p_1 is
  an axiom, it is a tautology (Lemma 1.5.6). If it follows from p_0 because
  p_0 ⇒ p_1, then p_0 → p_1 is a tautology (Lemma 1.5.7), so p_1 is a tautology
  by Lemma 1.5.8.

• If p_2 follows from p_0 or p_1, reason as in the previous case. Suppose p_0, p_1 ⇒
  p_2. Then, by Lemma 1.5.7, (p_0 ∧ p_1) → p_2 is a tautology. Because
  p_0, p_1 ⇒ p_0 ∧ p_1 by Conj, p_0 ∧ p_1 is a tautology. Thus, p_2 is a tautology by
  Lemma 1.5.8.

Since every p_i with i > 0 is an axiom, follows from some p_j with j < i, or follows
from some p_j, p_k with j, k < i, continuing in this manner, we find after finitely
many steps that p is a tautology. ∎


■ COROLLARY 1.5.10
For all propositional forms p_0, p_1, ..., p_{n-1}, q,
if p_0, p_1, ..., p_{n-1} ⊢ q, then p_0, p_1, ..., p_{n-1} ⊨ q.


The Law of Noncontradiction being a theorem of propositional logic (page 37) suggests 
that we have the following result, which is the second important property of proposi- 
tional logic. 


■ COROLLARY 1.5.11
Propositional logic is consistent.


PROOF
Let p be a propositional form. Suppose that p ∧ ¬p is a theorem. This implies that
it is a tautology by the soundness theorem (1.5.9), but p ∧ ¬p is a contradiction. ∎


Completeness 


We use the consistency of propositional logic to prove that propositional logic is com- 
plete. For this we need a few lemmas. 


■ LEMMA 1.5.12
If not Con(¬q, p_0, p_1, p_2, ...), then p_0, p_1, p_2, ... ⊢ q.


PROOF
If not Con(p_0, p_1, p_2, ...), then p_0, p_1, p_2, ... ⊢ q by Theorem 1.5.2, so suppose
Con(p_0, p_1, p_2, ...). Assume that there exists a propositional form r such that

¬q, p_0, p_1, p_2, ... ⊢ r ∧ ¬r.

This implies that there exists a formal proof

p_{i_0}, p_{i_1}, ..., p_{i_{n-1}}, ¬q, s_0, s_1, ..., s_{m-1}, r ∧ ¬r,        (1.10)

where ¬q is in the proof because Con(p_0, p_1, p_2, ...). Then,

p_{i_0}, p_{i_1}, ..., p_{i_{n-1}}, q

is a proof, where q follows by IP with (1.10) as the subproof. Therefore,

p_0, p_1, p_2, ... ⊢ q. ∎


■ LEMMA 1.5.13

If p_0, p_1, p_2, ... are maximally consistent, then for every propositional form q,
either q = p_i or ¬q = p_i for some i.


PROOF
Since p_0, p_1, p_2, ... are consistent, both q and ¬q cannot be terms of the sequence.
Suppose that it is ¬q that is not in the list. By the definition of maximal con-
sistency, we conclude that ¬q, p_0, p_1, p_2, ... are not consistent. Therefore, by
Lemma 1.5.12, we conclude that p_0, p_1, p_2, ... ⊢ q, and since the sequence is
maximally consistent, q = p_i for some i. ∎


To prove the next lemma, we use a technique called induction on propositional
forms. It states that a property will hold true for all propositional forms if two condi-
tions are met:

• The property holds for all propositional variables.

• If the property holds for p and q, the property holds for ¬p, p ∧ q, p ∨ q, p → q,
  and p ↔ q.

In proving the second condition, we first assume that

the property holds for the propositional forms p and q.        (1.11)

This assumption (1.11) is known as an induction hypothesis. Because

p ∨ q ⇔ ¬p → q,

p ∧ q ⇔ ¬(p → ¬q),

and

p ↔ q ⇔ (p → q) ∧ (q → p),

we need only to show that the induction hypothesis implies that the property holds for
¬p and p → q.
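
Induction on propositional forms corresponds to recursion on the structure of a form. As an illustration, the Python sketch below uses the three equivalences above to rewrite any form so that it contains only ¬ and →, and then confirms by truth table that the rewritten form agrees with the original. The tuple encoding and the names `to_neg_imp`, `evaluate`, `connectives`, and `same_table` are assumptions introduced for this sketch.

```python
from itertools import product

# Forms are variables (strings) or tuples ("not", a), ("and", a, b), ("or", a, b),
# ("imp", a, b), ("iff", a, b); this encoding is an assumption of the sketch.

def to_neg_imp(f):
    """Rewrite a form so that only "not" and "imp" remain, recursing on subforms."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [to_neg_imp(a) for a in args]
    if op == "not":
        return ("not", args[0])
    p, q = args
    if op == "imp":
        return ("imp", p, q)
    if op == "or":                      # p v q  <=>  not p -> q
        return ("imp", ("not", p), q)
    if op == "and":                     # p ^ q  <=>  not (p -> not q)
        return ("not", ("imp", p, ("not", q)))
    if op == "iff":                     # p <-> q  <=>  (p -> q) ^ (q -> p)
        return to_neg_imp(("and", ("imp", p, q), ("imp", q, p)))
    raise ValueError(op)

def evaluate(f, v):
    """Truth value of form f under the valuation v (a dict from variables to bools)."""
    if isinstance(f, str):
        return v[f]
    op, *args = f
    vals = [evaluate(a, v) for a in args]
    if op == "not":
        return not vals[0]
    a, b = vals
    return {"and": a and b, "or": a or b, "imp": (not a) or b, "iff": a == b}[op]

def connectives(f):
    """Set of connectives that occur in f."""
    if isinstance(f, str):
        return set()
    op, *args = f
    return {op}.union(*(connectives(a) for a in args))

def same_table(f, g, variables):
    """True when f and g agree under every valuation of the given variables."""
    return all(
        evaluate(f, dict(zip(variables, vals))) == evaluate(g, dict(zip(variables, vals)))
        for vals in product([True, False], repeat=len(variables))
    )

form = ("iff", ("or", "P", "Q"), ("and", "Q", "R"))
reduced = to_neg_imp(form)
print(connectives(reduced), same_table(form, reduced, "PQR"))
```

Because every form can be reduced this way, a property that survives ¬ and → survives all five connectives, which is why the induction step only needs those two cases.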


■ LEMMA 1.5.14
If Con(p_0, p_1, p_2, ...), there exists a valuation v such that

v(p) = T if and only if p = p_i

for some i = 0, 1, 2, ....


PROOF
Since Con(p_0, p_1, p_2, ...), we know that

p_0, p_1, p_2, ...                                              (1.12)

can be extended to a maximally consistent sequence of propositional forms by
Theorem 1.5.4. If we find the desired valuation for the extended sequence, that
valuation will also work for the original sequence, so assume that (1.12) is max-
imally consistent. Let X_0, X_1, X_2, ... represent all of the possible propositional
variables (page 53). Define

v(X_j) = T   if X_j is a propositional variable of p_i for some i,
         F   otherwise.

We prove that this is the desired valuation by induction on propositional forms.
We first claim that for all j,

v(X_j) = T if and only if X_j = p_i for some i.

To prove this, first note that if X_j is a term of (1.12), then v(X_j) = T by definition
of v. To show the converse, suppose that v(X_j) = T but X_j is not a term of
(1.12). By Lemma 1.5.13, there exists i such that ¬X_j = p_i. This implies that
v(¬X_j) = T, and then v(X_j) = F (Definition 1.1.9), a contradiction.

Now assume that

v(q) = T if and only if q = p_i for some i

and

v(r) = T if and only if r = p_i for some i.

We first prove that

v(¬q) = T if and only if ¬q = p_i for some i.

• Suppose that v(¬q) = T. Then, v(q) = F, and by induction, q is not
  in (1.12). Since p_0, p_1, p_2, ... is maximally consistent, ¬q is in the list
  (Lemma 1.5.13).

• Conversely, let ¬q = p_i for some i. By consistency, q is not in the se-
  quence. Therefore, by the induction hypothesis, v(q) = F, which implies
  that v(¬q) = T.

We next prove that

v(q → r) = T if and only if q → r = p_i for some i.

• Assume that v(q → r) = T. We have two cases to check. First, let
  v(q) = v(r) = T, so both q and r are in (1.12) by the induction hypoth-
  esis. Suppose q → r is not a term of the sequence. This implies that its
  negation ¬(q → r) is a term of the sequence by Lemma 1.5.13. Therefore,
  q ∧ ¬r is in the sequence by maximal consistency, so ¬r is also in the se-
  quence, a contradiction. Second, let v(q) = F. By induction, q is not a term
  of the sequence, which implies that ¬q is a term. Hence, ¬q ∨ r is in the
  sequence, which implies that q → r is also in the sequence.

• To prove the converse, suppose that q → r = p_i for some i. Assume
  v(q → r) = F. This means that v(q) = T and v(r) = F. Therefore, by
  induction, q is a term of the sequence but r is not. This implies that ¬r
  is in the sequence, so ¬q is in the sequence by MT. This contradicts the
  consistency of (1.12). ∎


■ THEOREM 1.5.15 [Completeness]

Every tautology of propositional logic is a theorem.


PROOF
Let p be a propositional form such that FL1, FL2, FL3 ⊬ p. By Lemma 1.5.12,
Con(FL1, FL2, FL3, ¬p). Therefore, by Lemma 1.5.14, there exists a valuation
such that v(p) = F, which implies that ⊭ p. ∎


■ COROLLARY 1.5.16

If p_0, p_1, p_2, ... ⊨ q, then p_0, p_1, p_2, ... ⊢ q for all propositional forms p_0, p_1, p_2, ....


We conclude by Theorems 1.5.9 and 1.5.15 and their corollaries that the notions 
of semantically valid and syntactically valid coincide for deductions in propositional 
logic. 
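The semantic side of this coincidence can be checked mechanically with truth tables. The following is a minimal sketch, not part of the text's formal system: propositional forms are modeled as Python functions of a valuation (a dictionary from atom names to truth values), and the helper names `implies` and `entails` are chosen here only for illustration.

```python
from itertools import product

def implies(a, b):
    """Truth function of the conditional a -> b."""
    return (not a) or b

def entails(premises, conclusion, atoms):
    """Brute-force check of semantic validity over every valuation of the atoms."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False  # a valuation makes the premises true and the conclusion false
    return True

atoms = ["P", "Q"]
# P -> Q, P entails Q (the semantic counterpart of modus ponens).
mp_valid = entails([lambda v: implies(v["P"], v["Q"]), lambda v: v["P"]],
                   lambda v: v["Q"], atoms)
# P -> Q, Q does not entail P.
converse_valid = entails([lambda v: implies(v["P"], v["Q"]), lambda v: v["Q"]],
                         lambda v: v["P"], atoms)
print(mp_valid, converse_valid)
```

By soundness and completeness, a deduction that passes this check also has a formal proof, and one that fails it has none.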


Exercises 
1. Prove Lemma 1.5.6. 
2. Provide the remaining parts of the proof of Lemma 1.5.7. 


3. Let p, p_0, p_1, p_2, ... be propositional forms. Prove the following.

(a) If p_0, p_1, p_2, ... ⊬ p, then Con(¬p, p_0, p_1, p_2, ...).

(b) If Con(p_0, p_1, p_2, ...) and p_0, p_1, p_2, ... ⊢ p, then Con(p, p_0, p_1, p_2, ...).

(c) If Con(p_0, p_1, p_2, ...), then Con(p, p_0, p_1, p_2, ...) or Con(¬p, p_0, p_1, p_2, ...).
4. Let p_0, p_1, p_2, ... be a maximally consistent sequence of propositional forms. Let p
and q be propositional forms. Prove the following.

(a) If p_0, p_1, p_2, ... ⊢ p, then p = p_k for some k.

(b) p ∧ q = p_k for some k if and only if p = p_i and q = p_j for some i and j.

(c) If (p → q) = p_i and p = p_j for some i and j, then q = p_k for some k.
5. Use truth tables to prove the following. Explain why this is a legitimate technique. 

(a) WAOV(ARV-7AP)F P > 7(QA R) 

(b) PVQ—>RE-AR--7Q 




(c) P>7A(Q> R)F/P->AR 
(d) PO QVRFQ-P 
(ec) PVQ>RAS,7AP — (T 3 AT), ARE AT 


6. Write a formal proof to show the following. Explain why this is a legitimate tech- 
nique. 

(a) 7>PVQ,7-QE-7P 

(b) ~(PAQ),PF-Q 

(c) P37 O,PEQVR 

(d) P7O,O>R,PER 

(ec) PVOAR,APER 


7. Write a formal proof to show the following. 
(a) ~(PAQ) FAP 
(b) (P>Q)V(R->S),PVRFOQVS 
(c) PVR,OVS,ROSEFRAS 
8. Write a formal proof to demonstrate the following. 


(a) p ∨ ¬p is a tautology.
(b) p ∧ ¬p is a contradiction.


9. Modify propositional logic by removing all replacement rules (1.3.9). Is the result- 
ing logic consistent? sound? complete? 


10. Modify propositional logic by removing all inference rules (1.2.10) except for In- 
ference Rule 1.3.10. Is the resulting logic consistent? sound? complete? 


CHAPTER 2 


FIRST-ORDER LOGIC 


2.1 LANGUAGES 


We developed propositional logic to model basic proof and truth. We did so by us- 
ing propositional forms to represent sentences that were either true or false. We saw 
that propositional logic is consistent, sound, and complete. However, the sentences 
of mathematics involve ideas that cannot be fully represented in propositional logic. 
These sentences are able to characterize objects, such as numbers or geometric figures, 
by describing properties of the objects, such as being even or being a rectangle, and 
the relationships between them, such as equality or congruence. Since propositional 
logic is not well suited to handle these ideas, we extend propositional logic to a richer 
system. 


Predicates 


Consider the sentence it is a real number. This sentence has no truth value because the 
meaning of the word it is undetermined. As noted on page 3, the word it is like a gap 
in the sentence. It is as if the sentence was written as 




A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’ Leary. 
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc. 




______ is a real number.


However, that gap can be filled. Let us make some substitutions for it: 


5 is a real number. 
π/7 is a real number.
Fido is a real number. 
My niece’s toy is a real number. 


With each replacement, the word that is undetermined is given meaning, and then the 
sentence has a truth value. In the examples, the first two propositions are true, and the 
last two are false. 

Notice that ______ changes, whereas is a real number remains fixed. This is because
these two parts of the sentence have different purposes. The first is a reference to an 
object, and the second is a property of the object. What we did in the examples was to 
choose a property and then test whether various objects have that property: 


    It                is a real number.
    ↑
    5                 is a real number.    True
    π/7               is a real number.    True
    Fido              is a real number.    False
    My niece’s toy    is a real number.    False


Depending on the result, these two parts are put together to form a sentence that either 
accurately or inaccurately affirms that a particular property is an attribute of an object. 
For example, the first sentence states that being a real number is a characteristic of 5, 
which is correct, and the last sentence states that being a real number is a characteristic 
of my niece’s toy, which is not correct. We have terminology for all of this. 


• The subject of a sentence is the expression that refers to an object.


• The predicate of a sentence is the expression that ascribes a property to the object
identified by the subject.


Thus, 5, π/7, Fido, and my niece’s toy are subjects that are substituted for it, and is a
real number is a predicate.

Substituting for the subject but not substituting for the predicate is a restriction that 
we make. We limit the extension of propositional logic this way because it is to be part 
of mathematical logic and this is what we do in mathematics. For example, take the 
sentence, 

x+2=7. 


On its own it has no truth value, but when we substitute x = 5 or x = 10, the sentence 
becomes a proposition. In this sense, x + 2 = 7 is like it is a real number. Both 
the sentences have a gap that is assigned a meaning giving the sentence a truth value. 




However, the mathematical sentence is different from the English sentence in that it is 
unclear as to what part is the subject. Is it x or x + 2? For our purposes, the answer is 
irrelevant. This is because in mathematical logic, the subject is replaced with a variable 
or, sometimes, with multiple variables. This change leads to a modification of what a 
predicate is. 


■ DEFINITION 2.1.1


A predicate is an expression that ascribes a property to the objects identified by 
the variables of the sentence. 


Therefore, the sentence x + 2 = 7 is a predicate. It describes a characteristic of x.
When expressions are substituted for x, the resulting sentence will be either a proposition
that affirms that the value added to 2 equals 7 or another predicate. If x = 5 is
substituted, the result is the true proposition 5 + 2 = 7. That is, x + 2 = 7 is satisfied by
5. If x = 10 and the substitution is made, the resulting proposition 10 + 2 = 7 is false. In
mathematics, it is also common to substitute with undetermined values. For example,
if the substitution is x = y, the result is the predicate y + 2 = 7, and if the substitution
is x = sin θ, the result is the predicate sin θ + 2 = 7. Substituting x = y + 2z² yields
y + 2z² + 2 = 7, a predicate with multiple variables. If we substitute x = y² − 7y, the
result is y² − 7y + 2 = 7, which is a predicate with multiple occurrences of the same
variable.

Assume that x represents a real number and consider the sentence 


there exists x such that x +2 = 7. (2.1) 


Although there is a variable with 2 occurrences, the sentence is a proposition, so in 
propositional logic, it is an atom and would be represented by P. This does not tell us 
much about the sentence, so we instead break the sentence into two parts: 


       Quantifier              Predicate
           ↓                       ↓
there exists x such that   |   x + 2 = 7


A quantifier indicates how many objects satisfy the sentence. In this case, the quanti- 
fier is the existential quantifier, making the sentence an existential proposition. Such 
a proposition claims that there is at least one object that satisfies the predicate. In par- 
ticular, although other sentences mean the same as (2.1) assuming that x represents a 
real number, such as 


there exists a real number x such that x + 2 = 7,
x + 2 = 7 for some real number x,
and
some real number x satisfies x + 2 = 7,


they all claim along with (2.1) that there is a real number x such that x + 2 = 7. This 
is true because 5 satisfies x + 2 = 7 (Figure 2.1). 


[Figure 2.1: Among the real numbers, there is at least one object that satisfies
x + 2 = 7. Therefore, there exists a real number x such that x + 2 = 7 is true.]


Again, assume that x is a real number. The sentence 
for all x, x + 5 = 5 + x (2.2)


claims that x +5 = 5 + x is true for every real number x. To see this, break (2.2) like 
(2.1): 


Quantifier      Predicate
    ↓               ↓
for all x,   |   x + 5 = 5 + x


This quantifier is called the universal quantifier. It states that the predicate is satisfied 
by each and every object. Including propositions that have the same meaning as (2.2), 
such as 


for all real numbers x, x + 5 = 5 + x,
and
x + 5 = 5 + x for every real number x,


(2.2) is true because each substitution of a real number for x satisfies the predicate
(Figure 2.2). These are examples of universal propositions.

A proposition can have multiple quantifiers. Take the equation y = 2x² + 1. Before
we learned the various techniques that make graphing this equation simple, we graphed 
it by writing a table with one column holding the x values and another holding the y 
values. We then chose numbers to substitute for x and calculated the corresponding y. 
Although we did not explicitly write it this way, we learned that 


for every real number x,
there exists a real number y such that y = 2x² + 1.        (2.3)


This is a universal proposition, and its predicate is


there exists a real number y such that y = 2x² + 1.


[Figure 2.2: Every object among the real numbers satisfies x + 5 = 5 + x. Therefore,
for all real numbers x, x + 5 = 5 + x is true.]


We conclude that (2.3) is true because whenever x is replaced with any real number,
there is a real number y that satisfies y = 2x² + 1.
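The way quantifiers combine with predicates can be illustrated on a small scale. The sketch below replaces the real numbers with a finite sample of integers, which is an assumption made only so that the check terminates. A finite check can confirm an existential proposition by finding a witness, but for a universal proposition it can only fail to find a counterexample.

```python
# A finite stand-in for the domain; the text quantifies over all real numbers.
domain = range(-10, 11)

# (2.1): there exists x such that x + 2 = 7.
exists_example = any(x + 2 == 7 for x in domain)

# (2.2): for all x, x + 5 = 5 + x.
forall_example = all(x + 5 == 5 + x for x in domain)

# (2.3): for every x, there exists y such that y = 2x^2 + 1.
# The candidate values for y must be large enough to contain every 2x^2 + 1.
candidates = range(0, 202)
nested_example = all(any(y == 2 * x**2 + 1 for y in candidates) for x in domain)

print(exists_example, forall_example, nested_example)
```

Note that the nested call mirrors the structure of (2.3): the universal quantifier is on the outside, and its predicate is itself an existential proposition about y.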

The symbolic logic that we define in this chapter is intended to model propositions 
that have a predicate and possibly a quantifier. As with propositional logic, an alphabet 
and a grammar will be chosen that enable us to write down the appropriate symbols 
to represent these sentences. Unlike propositional logic where we worked on both its 
syntax and semantics at the same time, this logic starts with a study only of its syntax. 


Alphabets 


No matter what mathematical subject we study, whether it is algebra, number theory, or 
something else, we can write our conclusions as propositions. These sentences usually 
involve mathematical symbols particular to the subject being studied. For example, 
both (2.1) and (2.2) are algebraic propositions. We know this because we recognize 
the symbols from algebra class. For this reason, the symbolic logic that we define will 
consist of two types of symbols. 


@ DEFINITION 2.1.2 


Logic symbols consist of the following: 


• Variable symbols: Sometimes simply referred to as variables, these symbols
serve as placeholders. On their own, they are without meaning but
can be replaced with symbols that do have meaning. Common examples
are the variable symbols in an algebraic equation. Variable symbols can
be lowercase English letters x, y, and z, or lowercase English letters with
subscripts, x_n, y_n, and z_n. Depending on the context, variable symbols are
sometimes expanded to include uppercase English letters, with or without
subscripts, as well as Greek and Hebrew letters. Denote the collection of
variable symbols by VAR.


• Connectives: ¬, ∧, ∨, →, ↔


• Quantifier symbols: ∀, ∃


• Equality symbol: =


• Grouping symbols: (, ), [, ], {, }
Theory symbols consist of the following: 


• Constant symbols: These are used to represent important objects in the
subject that do not change. Common constant symbols are 0 and e.


• n-ary function symbols: The term n-ary refers to the number of arguments.
For example, these symbols can represent unary functions that
take one argument such as cosine or binary functions such as multiplication
that take two arguments.


• n-ary relation symbols: These symbols are used to represent relations.
For example, < represents the binary relation of less than and R can
represent the unary relation is an even number.


It is not necessary to choose any theory symbols. However, if there are any, the- 
ory symbols must be chosen so that they are not connectives, quantifier symbols, 
the equality symbol, or grouping symbols, and the selection of theory symbols 
has precedence over the selection of variable symbols. This means that these two 
collections must have no common symbols. Moreover, although this is not the 
case in general, we assume that the logic symbols that are not variable symbols 
will be the same for all applications. On the other hand, theory symbols (if there 
are any) vary depending on the current subject of study. A collection of all logic 
symbols and any theory symbols is called a first-order alphabet and is denoted 
by A. 


The term theory refers to a collection of propositions all surrounding a particular 
subject. Since different theories have different notation (think about how algebraic 
notation differs from geometric notation), alphabets change depending on the subject 
matter. Let us then consider the alphabets for a number of theories that will be intro- 
duced later in the text. The foundational theory is the first example. 


■ EXAMPLE 2.1.3


Set theory is the study of collections of objects. The symbol ∈ is the only relation
symbol, and it is binary. It has no other theory symbols. The theory symbols of set
theory are denoted by ST and are summarized in the following table.


ST    Constant symbols    Function symbols    Relation symbol
                                              ∈


■ EXAMPLE 2.1.4


Number theory is the study of the natural numbers. The symbols + and · represent
regular addition and multiplication, respectively. As such they are binary
function symbols. These and its other theory symbols are indicated in the
following table and are denoted by NT.


NT    Constant symbols    Function symbols    Relation symbols
      0  1                +  ·


Another approach to number theory is called Peano arithmetic (Peano 1889).
It is the study of the natural numbers using Peano’s axioms. It has a constant
symbol 0 and a unary function symbol S. Denote these symbols by AR.


AR    Constant symbol    Function symbol    Relation symbols
      0                  S


The Peano arithmetic symbols are sometimes extended to include symbols for
the operations of addition and multiplication and the less-than relation. Denote
this extended collection by AR′.


AR′   Constant symbol    Function symbols    Relation symbol
      0                  S  +  ·             <


■ EXAMPLE 2.1.5


Group theory is the study of groups. A group is a set with an operation that satisfies
certain properties. Typically, the operation is denoted by the binary function
symbol ∘. There is also a constant represented by e. Ring theory is the study
of rings. A ring is a set with two operations that satisfy certain properties. The
operations are usually denoted by the binary function symbols ⊕ and ⊗. The
constant symbol is ○. These collections of theory symbols are denoted by GR
and RI, respectively, and are summarized in the following tables.


GR    Constant symbol    Function symbol     Relation symbols
      e                  ∘

RI    Constant symbol    Function symbols    Relation symbols
      ○                  ⊕  ⊗


Field theory is the study of fields. A field is a type of ring with extra properties,
so the theory symbols RI can be used to write about fields. However, if an order is
defined on a field, a binary relation symbol is needed, and the result is the theory
of ordered fields. Denote these symbols by OF.


OF    Constant symbol    Function symbols    Relation symbol
      ○                  ⊕  ⊗                <


Notice that the collections of theory symbols in the previous examples had at most 
two constant symbols. This is typical since subjects of study usually have only a few 




objects that require special recognition. However, there will be times when some extra 
constants are needed to reference objects that may or may not be named by the constant 
theory symbols. To handle these situations, we expand the given theory symbols by 
adding new constant symbols. 


■ DEFINITION 2.1.6


Let A be a first-order alphabet with theory symbols S. When constant symbols
not in S are combined with the symbols of S, the resulting collection of theory
symbols is denoted by S̅. The number of new constant symbols varies depending
on need.


For example, suppose that we are working in a situation where we need four constant
symbols in addition to the ones in OF. Denote these new symbols by c_1, c_2, c_3, and c_4.
Then, O̅F̅ consists of these four constant symbols and the symbols from OF.


Terms 


For a string to represent a proposition or a predicate from a particular theory, each 
nonlogic symbol of the string must be a theory symbol of that subject. For example, 


∀x(∈xA)


and 
Vd € aby{} Axny 


are strings for set theory. However, some strings have a reasonable chance of being 
given meaning, others do not. As with propositional logic, we need a grammar that 
will determine which strings are legal for the logic. Because a predicate might have 
variables, the types of representations that we want to make are more complicated than 
those of propositional logic. Hence, the grammar also will be more complicated. We 
begin with the next inductive definition (compare page 59). 


■ DEFINITION 2.1.7


Let A be a first-order alphabet with theory symbols S. An S-term is a string over
A such that


• a variable symbol from A is an S-term,

• a constant symbol from S is an S-term,

• f t_0 t_1 ··· t_{n−1} is an S-term if t_0, t_1, ..., t_{n−1} are S-terms and f is an n-ary
function symbol from S.


Denote the collection of strings over A that are S-terms by TERMS(A).


The string f t_0 t_1 ··· t_{n−1} is often written as f(t_0, t_1, ..., t_{n−1}) because it is common to
write functions using this notation. We must remember, however, that this notation can
always be replaced with the notation of Definition 2.1.7. Furthermore, when writing
about a general S-term where it is not important to mention S, we often simply write
using the word term without the S. We will follow this convention when writing about
similar definitions that require the theory symbols S.
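The inductive definition of an S-term translates directly into a recursive check on prefix strings. The following is a minimal sketch under stated assumptions: a string is given as a list of tokens, the theory symbols are supplied as a set of constants and a dictionary of function-symbol arities, and the names `parse_term` and `is_term` are hypothetical helpers rather than notation from the text. The ASCII token `*` stands in for the multiplication symbol of NT.

```python
def parse_term(tokens, pos, variables, constants, arity):
    """Read one term starting at tokens[pos].

    Returns the position just past the term, or None if no term starts there.
    """
    if pos >= len(tokens):
        return None
    tok = tokens[pos]
    if tok in variables or tok in constants:
        return pos + 1                     # a variable or constant symbol is a term
    if tok in arity:                       # f t_0 ... t_{n-1} with f n-ary
        pos += 1
        for _ in range(arity[tok]):
            pos = parse_term(tokens, pos, variables, constants, arity)
            if pos is None:
                return None
        return pos
    return None                            # not a symbol that can begin a term

def is_term(tokens, variables, constants, arity):
    """A string is a term exactly when one term consumes every token."""
    return parse_term(tokens, 0, variables, constants, arity) == len(tokens)

# Theory symbols NT: constants 0 and 1, binary + and * (for multiplication).
VARS, CONSTS, ARITY = {"x", "y", "z"}, {"0", "1"}, {"+": 2, "*": 2}

print(is_term(["+", "x", "y"], VARS, CONSTS, ARITY))            # +xy
print(is_term(["+", "*", "x", "y", "1"], VARS, CONSTS, ARITY))  # +*xy1
print(is_term(["x", "y", "+"], VARS, CONSTS, ARITY))            # xy+
```

Because each function symbol has a fixed number of arguments, the prefix string can be read in only one way, which is why the notation needs no parentheses.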


■ EXAMPLE 2.1.8


Here are examples of terms for each of the indicated theories. Assume that x, y,
and z_1 are variable symbols.


• 0 [Peano arithmetic]
• +xy [number theory]
• ∘ez_1 [group theory].


The string +xy is typically written as x + y, and the string ∘ez_1 is typically written
as e ∘ z_1. If NT is expanded to N̅T̅ by adding the constants c and d, then +cd is
an N̅T̅-term.


As suggested by Example 2.1.8, the purpose of a term is to represent an object of study. 
A variable symbol represents an undetermined object. A constant symbol represents an 
object that does not change, such as a number. A function symbol is used to represent 
a particular object given another object. For example, the NT-term +x2 represents the 
number that is obtained when x is added to 2. 


Formulas 


As propositional forms are used to symbolize propositions, the next definition is the 
grammar used to represent propositional forms and predicates. The definition is given 
inductively and resembles the parentheses-less prefix notation invented by Łukasiewicz
(1951) in the early 1920s. 


■ DEFINITION 2.1.9


Let S be the theory symbols from a first-order alphabet A. An S-formula is a
string over A such that


• t_0 = t_1 is an S-formula if t_0 and t_1 are S-terms.


• R t_0 t_1 ··· t_{n−1} is an S-formula if R is an n-ary relation symbol from S and
t_0, t_1, ..., t_{n−1} are S-terms.


• ¬p is an S-formula if p is an S-formula.

• p ∧ q, p ∨ q, p → q, and p ↔ q are S-formulas if p and q are S-formulas.


• ∀xp and ∃xp are S-formulas if p is an S-formula and x is a variable symbol
from A.


The string R t_0 t_1 ··· t_{n−1} is often represented as R(t_0, t_1, ..., t_{n−1}). A formula of
the form ∀xp is called a universal formula, and a formula of the form ∃xp is an
existential formula. Parentheses can be used around an S-formula for readability,
especially when quantifier symbols are involved. For example, ∀x∃yp is the
same formula as ∀x(∃yp) and ∀x∃y(p).


■ EXAMPLE 2.1.10


Let x and y be variable symbols. Let c be a constant symbol; f, g, and h be unary
function symbols; and R be a binary relation symbol from S. The following are
S-formulas.


• x = c
• Rcfy
• ¬(y = gc)
• Rxfx → Rfxx
• ∀x¬(fx = fc)
• ∃x∀y(Rfxgy ∧ Rfxhy).

In practice, Rcfy is usually written as R(c, f(y)), ¬(y = gc) as y ≠ g(c), Rxfx
as R(x, f(x)), Rfxx as R(f(x), x), and Rfxgy as R(f(x), g(y)).


■ EXAMPLE 2.1.11


These are some ST-formulas with their standard translations.


• ¬∈x{ }
x is not an element of { }.


• ∀x(∈xA → ∈xB)
For all x, if x ∈ A, then x ∈ B.


• ¬∃x∀y(∈yx)
It is not the case that there exists x such that for all y, y ∈ x.


• x = y ∨ ∈yx ∨ ∈xy
x = y or y ∈ x or x ∈ y.


■ EXAMPLE 2.1.12

Here are some NT-formulas with their corresponding predicates.


• ∀x∀y∀z[++xyz = +x+yz]
(x + y) + z = x + (y + z) for all x, y, and z.


• ¬(x = 0) → ∀y∃z(z = ·xy)
If x ≠ 0, then for all y, there exists z such that z = xy.


• ∀x(·1x = x)
For all x, 1x = x.


We now name the system just developed. 


■ DEFINITION 2.1.13


An alphabet combined with a grammar is called a language. The language given
by Definitions 2.1.2 and 2.1.9 is known as a first-order language. A formula
of a first-order language is a first-order formula, and the collection of all formulas of the
first-order language with theory symbols S is denoted by L(S).


The first-order language developed to represent predicates, either with quantifiers 
or without them, is summarized in Figure 2.3. An alphabet that has both logic and 
theory symbols is chosen. Using a grammar, terms are defined, and then by extending 
the grammar, formulas are defined. A natural question to ask regarding this system is 
what makes it first-order? Look at the last part of Definition 2.1.9. It gives the rule that 
allows the addition of a quantifier symbol in a formula. Only ∀x or ∃x are permitted,
where x is a variable symbol representing an object of study. Thus, only propositions
that begin as for all x ... or there exists x such that ... can be represented as a
first-order formula. To quantify over function and relation symbols, we need to define a
second-order formula. Augment the alphabet A with function and relation symbols,
which creates what is known as a second-order alphabet, and then add


∀f p and ∃f p are S-formulas if p is an S-formula
and f is a function symbol from A


and


∀Rp and ∃Rp are S-formulas if p is an S-formula
and R is a relation symbol from A


to Definition 2.1.9. For example, if the first-order formula
∀x(∈xA → ∈xB)
is intended to be true for any A and B, it can be written as the second-order formula
∀A∀B∀x(Ax → Bx),


where Ax represents the relation x ∈ A and Bx represents the relation x ∈ B.


[Figure 2.3: A first-order language. The alphabet consists of logic symbols and theory
symbols. The grammar builds terms, which represent objects, and then formulas, which
represent properties of objects.]


Exercises 


1. Determine whether the given strings are GR-terms. If a string is not a GR-term, find
all issues that prevent it from being one.

(a) 2e 

(b) xyo 

(c) 0 

(d) <xe 

(e) ooxye 


2. Write the GR-terms from Exercise 1 in their usual form (Example 2.1.10).


3. Extend ST to ST′ by adding the constant symbol ∅ and the unary function symbol
P. Determine whether the given strings are ST′-terms. If a string is not an ST′-term,
find all issues that prevent it from being one.

(a) ∈x∅

(b) ∅

(c) x

(d) P∈xy

(e) Px


4. Determine whether the given strings are OF-formulas. If a given string is not a 
OF-formula, find all issues that prevent it from being one. 

(a) < @xy@xy 

(b) ®x = @y 

(c) Vxdy(+xy = 0) 

(d) VxVy(@xy = yx) > <xy 

(e) 7A(<@xy A Oxy) & VWuVu(<@xy@uv) 


5. Write the RI-formulas from Exercise 4 in their usual form. 


6. Extend ST to ST” by adding the constant symbol @ and the binary relation symbol 
R. Determine whether the given strings are ST’’-formulas. If a given string is not a 
ST’’-formula, find all issues that prevent it from being one. 

(a) Rx@V eExy 

(b) Rx@ > E€4+xyz 

(c) VWxdS(7ExD) 

(d) Vx VVy V Vz(ESxyz) V EDS 

(e) [Ax(Rx@) > dudvaw(7Euv © 7Euw)| VS = x, 


7. Write the ST’’-formulas from Exercise 6 in their usual form. 


8. Translate the given sentence to an S-formula for the given theory symbols.
(a) For all x, S(x) ≠ 0. (S = AR)
(b) For every number x, there is a number y such that x ∘ y = e. (S = GR)
(c) If x < y and y < z, then x < z. (S = OF)
(d) It is false that x ∈ y, y ∈ z, and z ∈ x for all x, y, and z. (S = ST)
(e) For every u and v, there exists w such that if u = v, then u + w = v + w.
(S = RI and + should be translated as ⊕ in the RI-formula.)


9. Extend NT to NT′ by adding the numerals 2, 3, ..., 9. Answer the following questions.

(a) Is +34 = 7 an NT′-term? Explain.

(b) Is ·4+39 an NT′-term? Explain.

(c) Is ∃x(+·4x8 = +3x) an NT′-formula? If it is, find x.

(d) If possible, give an example of an NT-formula that is not an NT′-formula.

(e) If possible, give an example of an NT′-formula that is not an NT-formula.


2.2 SUBSTITUTION 


As noted in the beginning of Section 2.1, there are times when a substitution will be
made for a predicate’s variable. For example, in algebra, if f(x) = 3x² + x + 1, then
f(y) = 3y² + y + 1, f(2) = 3(2)² + 2 + 1, and f(sin x) = 3 sin² x + sin x + 1. That is,
we can substitute with variables, constants, and the results from functions. Therefore,
to represent this in formulas, we can replace a variable with a term.


Terms 


We begin by defining what it means to substitute for a variable in a term. We use ⇔
because one string can be replaced with the other string.

■ DEFINITION 2.2.1 [Substitution in Terms]


Let S be theory symbols from a first-order alphabet A. Let x be a variable symbol
from A and t be an S-term. The notation

$$\frac{t}{x}$$

means that x is replaced with t at every appropriate occurrence of x. For terms,
this means the following.


• If y is a variable symbol from A,

$$y\,\frac{t}{x} \Leftrightarrow \begin{cases} t & \text{if } x = y, \\ y & \text{if } x \neq y. \end{cases}$$

• If c is a constant symbol from S,

$$c\,\frac{t}{x} \Leftrightarrow c.$$

• If f is an n-ary function symbol from S and s_0, s_1, ..., s_{n−1} are S-terms,

$$(f s_0 s_1 \cdots s_{n-1})\,\frac{t}{x} \Leftrightarrow f\left(s_0\frac{t}{x}\right)\left(s_1\frac{t}{x}\right)\cdots\left(s_{n-1}\frac{t}{x}\right).$$

Observe that when a substitution of a term is made into a term, the result is another 
term. 
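The three cases of Definition 2.2.1 can be mirrored by a short recursive function. This is a minimal sketch under an assumed representation that is not part of the text: a term is a tagged tuple, either `("var", name)`, `("const", name)`, or `("fn", symbol, arguments)` with the arguments given as a tuple of terms. The name `subst_term` is a hypothetical helper.

```python
def subst_term(s, t, x):
    """Return the term obtained from s by replacing the variable named x with t."""
    kind = s[0]
    if kind == "var":
        return t if s[1] == x else s      # replace x itself, leave other variables
    if kind == "const":
        return s                          # constants are never replaced
    # function symbol: substitute into each argument
    return ("fn", s[1], tuple(subst_term(a, t, x) for a in s[2]))

# Symbols chosen for illustration: variables x, y, z; constant c;
# binary function symbol f; unary function symbols g and h.
x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
c = ("const", "c")
f = lambda a, b: ("fn", "f", (a, b))
g = lambda a: ("fn", "g", (a,))
h = lambda a: ("fn", "h", (a,))

print(subst_term(x, c, "x"))                 # x with c for x
print(subst_term(y, c, "x"))                 # y with c for x
print(subst_term(f(x, g(x)), h(z), "x"))     # fxgx with hz for x
```

Every branch returns a term of the same representation, which reflects the observation that substituting a term into a term yields another term.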


■ EXAMPLE 2.2.2


Let x, y, and z be distinct variable symbols; c and d be constant symbols; f be 
a binary function symbol; and g and h be unary function symbols. This means 
that fcggd is typically written as f(c, g(g(d))). 


gee gy 
x 
c 
*x- $c 
x 


[xz 


*y— ey 
x 


x 
ec- ec 
x 


e 
Poa 
Ye 
— 
< 
NX 
<|F 
Nn—— 
«|S 
¢ 
yw 
~~ 
9° 
N 


e e 
EO Ie 
| a ears NM ca eeees, | 
LS. OBS 
= N 

a eo 
< NS 
eee a 
eee 
~[8 SIS 
ey Se 
NS x [2 
rey OG 

| a 
Ss 

oO 
a 
a9 

Q 


Free Variables 


Substitution for a variable in a formula is a bit more involved. This is because of the 
influence of any possible quantifiers. For example, take the formula 


x = y ∨ x = fy. (2.4)


By Definition 2.2.1, we know that we can substitute constants c for x and d for y in the
terms x, y, and fy. We expect that we should also be able to make this substitution
into the formula resulting in

c = d ∨ c = fd.


However, in the formula
∀x∃y(x = y ∨ x = fy), (2.5)


the situation is different because of the quantifiers. Consider the corresponding
occurrences in (2.4) and its quantified version (2.5):


         x = y ∨ x = fy
         ↓   ↓   ↓    ↓
    ∀x∃y(x = y ∨ x = fy)


Even though each occurrence of x and y in (2.4) can receive a substitution, each
corresponding occurrence in (2.5) cannot because of the quantifiers.


■ DEFINITION 2.2.3


Let S be theory symbols. Let t_0, t_1, ..., t_{n−1} be S-terms, R be an n-ary relation
symbol from S, and p and q be S-formulas. An occurrence of a variable in a
formula is free or not free only according to the following rules:


• A variable occurrence in t_0 = t_1 and R t_0 t_1 ··· t_{n−1} is free.


• A variable occurrence in ¬p is free if and only if the corresponding
occurrence in p is free.


• A variable occurrence in p ∧ q, p ∨ q, p → q, and p ↔ q is free if and only
if the corresponding occurrence in p or q is free.


• Any occurrence of x in ∀xp and ∃xp is not free.


• Any occurrence of y ≠ x is free in ∀xp and ∃xp if and only if the
corresponding occurrence of y is free in p.


If an occurrence of a variable is not free, it is bound. All free occurrences of
x in p are within the scope of the universal quantifier in ∀xp and the existential
quantifier in ∃xp.


■ EXAMPLE 2.2.4


Let f be a unary function symbol and R be a 3-ary relation symbol. In the
formula


∀x∃y(x = y ∨ fx = y → Rxyz),


all occurrences of x and y are bound, but the occurrence of z is free. In the
formula


∃y(x = y ∨ fx = y → Rxyz),


all occurrences of x and z are free, but the occurrences of y are bound. In


x = y ∨ fx = y → Rxyz,


all occurrences are free because all occurrences are free in x = y, fx = y, and
Rxyz.


We need to know whether a formula has a free occurrence of a variable, so we make
the next definition.


■ DEFINITION 2.2.5


A free variable of the S-formula p is a variable that has a free occurrence in p. 


■ EXAMPLE 2.2.6


Using the f and R from Example 2.2.4, both x and y are free variables of the
formula
Rxyc → ∃x(fy = c),


even though x has both free and bound occurrences.
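The rules of Definition 2.2.3 determine the free variables of a formula by recursion on its structure. The following is a minimal sketch using an assumed tuple representation that does not come from the text: terms are `("var", name)`, `("const", name)`, or `("fn", symbol, args)`, and formulas are `("eq", t0, t1)`, `("rel", symbol, args)`, `("not", p)`, a binary connective tag with two subformulas, or `("forall", name, p)` and `("exists", name, p)`. The helper names are hypothetical.

```python
def term_vars(s):
    """Names of the variable symbols that occur in a term."""
    if s[0] == "var":
        return {s[1]}
    if s[0] == "const":
        return set()
    return set().union(*(term_vars(a) for a in s[2]))

def free_vars(p):
    """Names of the variables that have a free occurrence in the formula p."""
    kind = p[0]
    if kind == "eq":
        return term_vars(p[1]) | term_vars(p[2])
    if kind == "rel":
        return set().union(*(term_vars(a) for a in p[2]))
    if kind == "not":
        return free_vars(p[1])
    if kind in ("and", "or", "imp", "iff"):
        return free_vars(p[1]) | free_vars(p[2])
    if kind in ("forall", "exists"):
        return free_vars(p[2]) - {p[1]}   # the quantified variable is bound
    raise ValueError("not a formula: %r" % (p,))

x, y, z, c = ("var", "x"), ("var", "y"), ("var", "z"), ("const", "c")
f = lambda a: ("fn", "f", (a,))
R = lambda *args: ("rel", "R", args)

# The quantifier-free formula of Example 2.2.4: x = y or fx = y implies Rxyz.
body = ("imp", ("or", ("eq", x, y), ("eq", f(x), y)), R(x, y, z))
print(free_vars(("forall", "x", ("exists", "y", body))))
print(free_vars(("exists", "y", body)))
# The formula of Example 2.2.6: Rxyc implies exists x (fy = c).
print(free_vars(("imp", R(x, y, c), ("exists", "x", ("eq", f(y), c)))))
```

The function reports variables rather than individual occurrences, so it corresponds to Definition 2.2.5: a variable is listed as soon as one of its occurrences is free.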


Formulas 
We can now define what it means to make a substitution into a formula. 


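■ DEFINITION 2.2.7 [Substitution in Formulas]


Let S be theory symbols from a first-order alphabet A. Let x and y be variable
symbols of A and R be an n-ary relation symbol from S. Suppose that p and q
are S-formulas and t, t_0, t_1, ..., t_{n−1} are S-terms.


• $(t_0 = t_1)\frac{t}{x} \Leftrightarrow t_0\frac{t}{x} = t_1\frac{t}{x}$

• $(R t_0 t_1 \cdots t_{n-1})\frac{t}{x} \Leftrightarrow R\, t_0\frac{t}{x}\, t_1\frac{t}{x} \cdots t_{n-1}\frac{t}{x}$

• $(\neg p)\frac{t}{x} \Leftrightarrow \neg\left(p\frac{t}{x}\right)$

• $(p \wedge q)\frac{t}{x} \Leftrightarrow p\frac{t}{x} \wedge q\frac{t}{x}$

• $(p \vee q)\frac{t}{x} \Leftrightarrow p\frac{t}{x} \vee q\frac{t}{x}$

• $(p \to q)\frac{t}{x} \Leftrightarrow p\frac{t}{x} \to q\frac{t}{x}$

• $(p \leftrightarrow q)\frac{t}{x} \Leftrightarrow p\frac{t}{x} \leftrightarrow q\frac{t}{x}$

• $(\forall y\,p)\frac{t}{x} \Leftrightarrow \begin{cases} \forall y\left(p\frac{t}{x}\right) & \text{if } x \neq y \text{ and } y \text{ is not a symbol of } t, \\ \forall y\,p & \text{otherwise.} \end{cases}$

• $(\exists y\,p)\frac{t}{x} \Leftrightarrow \begin{cases} \exists y\left(p\frac{t}{x}\right) & \text{if } x \neq y \text{ and } y \text{ is not a symbol of } t, \\ \exists y\,p & \text{otherwise.} \end{cases}$
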
The condition on the term t in the last two parts of Definition 2.2.7 is important.
Consider the RI-formula


p := ∃y(x ⊕ y = ○).


The usual interpretation of p is that given x, there exists a number y such that x + y = 0.
Let f be a unary function symbol and z be a variable symbol. Since y is not a symbol
of z,

$$p\frac{z}{x} \Leftrightarrow \exists y(z \oplus y = \bigcirc),$$

which has the same standard interpretation as p. Since y is not a symbol of fz, we can
substitute to find that

$$p\frac{fz}{x} \Leftrightarrow \exists y(fz \oplus y = \bigcirc).$$

This is a reasonable substitution because it states that for the number given by fz, there
is a number y that when added to fz gives 0. This is very similar to the standard
interpretation of p. Both of these substitutions work because the number of free occurrences
is unchanged by the substitution. However, if we allow the term to include y among its
symbols, the substitution $p\frac{y}{x}$ would yield


∃y(y ⊕ y = ○). (2.6)


The typical interpretation of (2.6) is that there exists a number y such that y + y = 0.
This is a reasonable proposition, but not in the spirit of the original formula p. The
change of interpretation is due to the change in the number of free occurrences. The
formula p has one free occurrence, while (2.6) has none. Therefore, when making a
substitution, the number of free occurrences should not change, and for this reason,

$$[\exists y(x \oplus y = \bigcirc)]\frac{y}{x} \Leftrightarrow \exists y(x \oplus y = \bigcirc).$$
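The side condition in the quantifier cases can be seen at work in a short program. The following is a minimal sketch of Definition 2.2.7 under an assumed tuple representation that is not part of the text: terms are `("var", name)`, `("const", name)`, or `("fn", symbol, args)`, and formulas are tagged tuples such as `("eq", t0, t1)` and `("exists", name, p)`. The names `subst`, `subst_term`, and `term_vars` are hypothetical helpers, and the strings `"+"` and `"O"` stand in for the ring symbols.

```python
def term_vars(s):
    """Names of the variable symbols that occur in a term."""
    if s[0] == "var":
        return {s[1]}
    if s[0] == "const":
        return set()
    return set().union(*(term_vars(a) for a in s[2]))

def subst_term(s, t, x):
    """Substitution in terms (Definition 2.2.1)."""
    if s[0] == "var":
        return t if s[1] == x else s
    if s[0] == "const":
        return s
    return ("fn", s[1], tuple(subst_term(a, t, x) for a in s[2]))

def subst(p, t, x):
    """Substitution in formulas (Definition 2.2.7)."""
    kind = p[0]
    if kind == "eq":
        return ("eq", subst_term(p[1], t, x), subst_term(p[2], t, x))
    if kind == "rel":
        return ("rel", p[1], tuple(subst_term(a, t, x) for a in p[2]))
    if kind == "not":
        return ("not", subst(p[1], t, x))
    if kind in ("and", "or", "imp", "iff"):
        return (kind, subst(p[1], t, x), subst(p[2], t, x))
    if kind in ("forall", "exists"):
        y, body = p[1], p[2]
        if x != y and y not in term_vars(t):
            return (kind, y, subst(body, t, x))
        return p                          # otherwise the formula is left unchanged
    raise ValueError("not a formula: %r" % (p,))

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
zero = ("const", "O")
plus = lambda a, b: ("fn", "+", (a, b))
f = lambda a: ("fn", "f", (a,))

p = ("exists", "y", ("eq", plus(x, y), zero))   # exists y (x + y = O)
print(subst(p, z, "x"))      # z replaces the free x
print(subst(p, f(z), "x"))   # fz replaces the free x
print(subst(p, y, "x"))      # y is a symbol of the term, so p is unchanged
```

The last call returns p itself, which is the behavior the definition prescribes so that the substituted y is not captured by the existential quantifier.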


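■ EXAMPLE 2.2.8

Let p be the NT-formula

$$\forall x_0 \forall x_1 (x_0 = x_1 \to x_0 + x_2 = x_1 + x_2).$$

Notice that x_2 is a free variable of p. However, all occurrences of x_0 and x_1 are
bound. Therefore,

$$p\frac{0}{x_0} \Leftrightarrow \forall x_0 \forall x_1 (x_0 = x_1 \to x_0 + x_2 = x_1 + x_2),$$

and then letting y be a variable symbol,

$$\left(p\frac{0}{x_0}\right)\frac{y}{x_1} \Leftrightarrow \forall x_0 \forall x_1 (x_0 = x_1 \to x_0 + x_2 = x_1 + x_2),$$

and finally,

$$\left[\left(p\frac{0}{x_0}\right)\frac{y}{x_1}\right]\frac{1}{x_2} \Leftrightarrow \forall x_0 \forall x_1 (x_0 = x_1 \to x_0 + 1 = x_1 + 1).$$

This last formula has no free variables. A standard interpretation of this formula
is

for every integer x_0 and x_1, if x_0 = x_1, then x_0 + 1 = x_1 + 1.

Furthermore,

$$p\frac{x_1}{x_2} \Leftrightarrow [\forall x_0 \forall x_1 (x_0 = x_1 \to x_0 + x_2 = x_1 + x_2)]\frac{x_1}{x_2}$$

$$\Leftrightarrow \forall x_0 [\forall x_1 (x_0 = x_1 \to x_0 + x_2 = x_1 + x_2)]\frac{x_1}{x_2}$$

$$\Leftrightarrow \forall x_0 \forall x_1 (x_0 = x_1 \to x_0 + x_2 = x_1 + x_2).$$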

Let p represent the NT-formula +y2 = 7. Observe that p has y as a free variable. To 
emphasize this, instead of writing 


p:= +y2=7, 
we often denote the formula by p(y) and write 
p(y) := +y2 = 7.
Although x is not a variable in the equation, we can also write 
q(x, y):= +y2=7. 


For example, if we wanted to interpret the formula as an equation with one variable, 
we would use p(y). If we wanted to view it as the horizontal line y = 5, we would use 
q(x, y). 


@ DEFINITION 2.2.9 


Let A be a first-order alphabet with theory symbols S. Let p be an S-formula. If
p has no free variables or the free variables of p are among the distinct variables
$x_0, x_1, \ldots, x_{n-1}$ from A, define

$p(x_0, x_1, \ldots, x_{n-1}) \Leftrightarrow p.$


The notation of Definition 2.2.9 can also be used to represent substitutions. Consider 
the formula p := x + y = 0. Observe that 


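$p\frac{y}{x} \Leftrightarrow y + y = 0$

and

$\left(p\frac{y}{x}\right)\frac{x}{y} \Leftrightarrow x + x = 0.$

However, suppose that we want to substitute into p so that the result is y + x = 0. To
accomplish this, we need two new and distinct variable symbols, u and v. Then,

$\left(p\frac{u}{x}\right)\frac{v}{y} \Leftrightarrow u + v = 0,$ (2.7)

and then,

$\left[(u + v = 0)\frac{y}{u}\right]\frac{x}{v} \Leftrightarrow y + x = 0.$ (2.8)

Therefore,

$\left(\left[\left(p\frac{u}{x}\right)\frac{v}{y}\right]\frac{y}{u}\right)\frac{x}{v} \Leftrightarrow y + x = 0.$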


This works because in (2.7), each variable symbol is replaced by a new symbol in such 
a way that the resulting formula has the same meaning as the original. In this way, when 
the original variable symbols are brought back in (2.8), all of the substitutions are made 
into distinct variables so that there are no conflicts and the switch can be made. This 
process can be generalized to any number of variable symbols in any order. 


@ DEFINITION 2.2.10 


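Let $p(x_0, x_1, \ldots, x_{n-1})$ be an S-formula and $u_0, u_1, \ldots, u_{n-1}$ be distinct variable
symbols not among $x_0, x_1, \ldots, x_{n-1}$. Define

$p(u_0, u_1, \ldots, u_{n-1}) \Leftrightarrow \left(\cdots\left(\left(p\frac{u_0}{x_0}\right)\frac{u_1}{x_1}\right)\cdots\right)\frac{u_{n-1}}{x_{n-1}}.$

Then, for all S-terms $t_0, t_1, \ldots, t_{n-1}$, define

$p(t_0, t_1, \ldots, t_{n-1}) \Leftrightarrow \left(\cdots\left(\left(p(u_0, u_1, \ldots, u_{n-1})\frac{t_0}{u_0}\right)\frac{t_1}{u_1}\right)\cdots\right)\frac{t_{n-1}}{u_{n-1}}.$

This is called a simultaneous substitution and is equivalent to replacing all free
occurrences of $x_0, x_1, \ldots, x_{n-1}$ in p with $t_0, t_1, \ldots, t_{n-1}$, respectively.


Observe that if p does not have free variables, then $p(t_0, t_1, \ldots, t_{n-1}) \Leftrightarrow p$.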
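
The two-phase idea behind simultaneous substitution can be sketched in Python for quantifier-free formulas. This is a minimal sketch under assumptions made here: expressions are nested tuples, the helper names `replace` and `simultaneous` are hypothetical, and the fresh symbols `u0`, `u1`, ... are assumed not to occur in the formula or in the terms.

```python
def replace(expr, t, x):
    """Replace every occurrence of the variable x in a quantifier-free expression by t."""
    if isinstance(expr, str):
        return t if expr == x else expr
    return (expr[0],) + tuple(replace(arg, t, x) for arg in expr[1:])

def simultaneous(expr, variables, terms):
    """Two-phase substitution: first rename to fresh symbols, then insert the terms."""
    fresh = [f"u{i}" for i in range(len(variables))]  # assumed unused elsewhere
    for x, u in zip(variables, fresh):
        expr = replace(expr, u, x)
    for u, t in zip(fresh, terms):
        expr = replace(expr, t, u)
    return expr

p = ("=", ("+", "x", "y"), "0")                     # x + y = 0

# One substitution after another loses the distinction between x and y.
naive = replace(replace(p, "y", "x"), "x", "y")      # x + x = 0

# Renaming through fresh symbols performs the intended swap.
swapped = simultaneous(p, ["x", "y"], ["y", "x"])    # y + x = 0
```

The value `naive` corresponds to the failed attempt x + x = 0, while `swapped` corresponds to the result y + x = 0 obtained through (2.7) and (2.8).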


M@ EXAMPLE 2.2.11 


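To illustrate Definition 2.2.10, let RI′ be the theory symbols of RI combined with
constant symbols 1 and 2. Let p be the RI′-formula

$x \otimes (y \oplus z) = x \otimes y \oplus x \otimes z.$

Since x, y, and z are free variables, represent p by p(x, y, z, w). Then,

$p(1, x, y, 2) \Leftrightarrow \left(\left(\left(p(u_0, u_1, u_2, u_3)\frac{1}{u_0}\right)\frac{x}{u_1}\right)\frac{y}{u_2}\right)\frac{2}{u_3}$

$\Leftrightarrow \left(\left(\left([u_0 \otimes (u_1 \oplus u_2) = u_0 \otimes u_1 \oplus u_0 \otimes u_2]\frac{1}{u_0}\right)\frac{x}{u_1}\right)\frac{y}{u_2}\right)\frac{2}{u_3}$

$\Leftrightarrow \left(\left([1 \otimes (u_1 \oplus u_2) = 1 \otimes u_1 \oplus 1 \otimes u_2]\frac{x}{u_1}\right)\frac{y}{u_2}\right)\frac{2}{u_3}$

$\Leftrightarrow \left([1 \otimes (x \oplus u_2) = 1 \otimes x \oplus 1 \otimes u_2]\frac{y}{u_2}\right)\frac{2}{u_3}$

$\Leftrightarrow [1 \otimes (x \oplus y) = 1 \otimes x \oplus 1 \otimes y]\frac{2}{u_3}$

$\Leftrightarrow 1 \otimes (x \oplus y) = 1 \otimes x \oplus 1 \otimes y.$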


WM EXAMPLE 2.2.12 
Let S have constant symbols 5 and 9. Define p(x, y) to be the S-formula

$\forall x \exists z\,[q(x, y) \wedge r(z)] \vee \exists y\,[r(y) \to s(x)].$

The first occurrence of y is free, and since $\forall x$ applies only to variables within the
brackets on the left, the last occurrence of x is also free. Therefore, p(x, y) is the
disjunction of two formulas that we call u(y) and v(x):

$\underbrace{\forall x \exists z\,[q(x, y) \wedge r(z)]}_{u(y)} \vee \underbrace{\exists y\,[r(y) \to s(x)]}_{v(x)},$

from which we derive

$p(9, 5) \Leftrightarrow u(5) \vee v(9) \Leftrightarrow \forall x \exists z\,[q(x, 5) \wedge r(z)] \vee \exists y\,[r(y) \to s(9)].$

As in Example 2.2.11, finding p(9, 5) is equivalent to simply replacing the free
occurrences of x with 9 and the free occurrences of y with 5.


M@ EXAMPLE 2.2.13 


Consider the NT-formula,

$\forall x \forall y \forall z\,[{+}{+}xyz = {+}x{+}yz],$ (2.9)

which is often written as

$\forall x \forall y \forall z\,[(x + y) + z = x + (y + z)].$ (2.10)

The formula within the scope of the first quantifier symbol in (2.10) is

$p(x) := \forall y \forall z\,[(x + y) + z = x + (y + z)].$ (2.11)

Notice that the occurrences of x are free in (2.11), but the occurrences of y and
z are bound. For example, we can make the substitution

$p(2) \Leftrightarrow \forall y \forall z\,[(2 + y) + z = 2 + (y + z)].$

Now, letting

$q(x, y) := \forall z\,[(x + y) + z = x + (y + z)],$

we have that $p(x) \Leftrightarrow \forall y\,q(x, y)$. We can also define

$r(x, y, z) := (x + y) + z = x + (y + z)$

so that $q(x, y) \Leftrightarrow \forall z\,r(x, y, z)$. What we have done is to break apart an NT-formula
that contains multiple quantifiers into a sequence of formulas:

$\forall x\,\underbrace{\forall y\,\underbrace{\forall z\,\underbrace{[(x + y) + z = x + (y + z)]}_{r(x, y, z)}}_{q(x, y)}}_{p(x)}.$

We conclude that (2.10) can be written as

$\forall x\,p(x),$

$\forall x \forall y\,q(x, y),$

or

$\forall x \forall y \forall z\,r(x, y, z).$


Observe that (2.9) has no free variables. It is a type of formula of particular importance 
because it represents a proposition. In other words, (2.9) is a propositional form. 


M@ DEFINITION 2.2.14 


An S-formula with no free variables is called an S-sentence. 


Exercises 


1. Given a term, make the indicated substitution. 




2. Given a formula, make the indicated substitution. 
@ [«<22]* 
xix 


(b) ([x<s+x+4<92]2)2 


0 ([(lo-ss-981) 19) 
xly/ x|u/Jov 
(d) (ia -4= vw 2=x4312) 2 


3. Identify all free occurrences of x in the given formulas. 
(a) x+4< 10 
(b) AxVy(x + y = 0) 
(c) VxVy(x+y=yt+x)VV2(x+y=2z-3) 
(d) Vx[VzVy(x + y=2-z)eOx+x=2-x] 
4. Identify all bound occurrences of x in the given formulas. 
(a) (Vx) [p(x) > (Ay)qQ)] 
(b) (Ax)p(x, y) > (Vy)qQx) 
(c) (Ax)(x>4Ax < 10 
(d) [(VWx)x+3 = 1l)Ax=9] Vv (Ay) < 0) 


5. Make the simultaneous substitution p(a) for each of the given formulas. 
(a) p(x) :=2x+a=9 
(b) p(x) :=Vx(x + y=y4+x) 
(c) p(x) := dxq(x) > Vyr(x, y) 
(d) p(x) :=x+4=O0AAxQV+z2=x)V Az(x +z = 3) 
6. Make the simultaneous substitution q(1, 2,3) for each of the given formulas. 
(a) q(x, y,Z):=u+tvut+w 
(b) q(x, y,z) := rx, y) > (Wx) [p(x) A (Ayty, 2). 
(c) q(x, y,Z):= dx(x + y = 2Z)AVy(x + y+ Z) V VxVy(x + y + 2) 
(d) q(x, y, z) = Ax[p(x) A Vy(p(x) A PY) > X = yV y= zZ)] 
7. Identify the formula to the right of each quantifier in the given formulas.
(a) (Vx) [p(x) > Cy)aQ)] 
(b) (Ax)p(x, y) > (Vy)q(y) 
(c) (Wx)(Ay)(Az)p(x, y, Z) A rw) 
(d) p(x) A (Ax)(Vy)ax, y) V (Vz)r(z) 
8. Which of the following are sentences?
(a) VxVy[q(x) V r(y)] > Vylg(a) V r(y)] 
(b) g2) A t3) > Aylq2) A t(y)] 
(c) Aw[Ax[p(x) Vv Azq(z) @ Aylp(x) A q(y)] > Vxr(x)] > Vyrx)] 
(d) VxVy(p(x) > [4(y) A r(Z)]) > Ax7q(x) 




2.3. SYNTACTICS 


Since formulas without free variables are propositional forms, we can write proofs 
involving them using the rules from Sections 1.2 through 1.4. However, since the in- 
ference rules did not involve quantification, we need new rules to deal with universal 
and existential formulas. We need rules covering not only negations but also rules that 
enable the removal (instantiation) and adjoining (generalization) of quantifiers. We 
add these rules to Definition 1.2.13 to obtain a stronger notion of proof. Furthermore, 
since every sentence is a propositional form, every reference to a propositional form in 
Definition 1.2.13 can be considered to be a reference to a sentence. This allows us to 
write formal proofs using first-order languages. 

Quantifier Negation 

Consider the proposition 


all rectangles are squares. 


This sentence is false because there is a rectangle in which one side is twice the length 
of the adjacent side, so 


not all rectangles are squares. 
That is, 
some rectangle is not a square 
is true. Generalizing, we conclude that 
the negation of all P are Q is some P are not Q. 
This should be translated as an inference rule for formulas, so we assume
$\neg\forall x\,p \Rightarrow \exists x\,\neg p.$ (2.12)
Now consider the proposition 
some rectangles are round. 
This is false because there are no round rectangles, so 
all rectangles are not round 
is true. Generalizing, we conclude that 
the negation of some P are Q is all P are not Q. 


Again, this should be translated as an inference rule for formulas, so we assume

$\neg\exists x\,p \Rightarrow \forall x\,\neg p.$ (2.13)


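[Figure: a square with the formulas $\forall x\,p$, $\forall x\,\neg p$, $\exists x\,p$, and $\exists x\,\neg p$ at its corners; the diagonals are labeled "Negations".]


Figure 2.4 The modern square of opposition.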


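Furthermore, by DN and (2.13),

$\exists x\,\neg p \Rightarrow \neg\neg\exists x\,\neg p \Rightarrow \neg\forall x\,\neg\neg p \Rightarrow \neg\forall x\,p,$

and by DN with (2.12),

$\forall x\,\neg p \Rightarrow \neg\neg\forall x\,\neg p \Rightarrow \neg\exists x\,\neg\neg p \Rightarrow \neg\exists x\,p.$

We summarize assumptions (2.12) and (2.13) and the two conclusions in the following
replacement rule.


■ REPLACEMENT RULES 2.3.1 [Quantifier Negation (QN)]
Let S be theory symbols and p be an S-formula.

• $\neg\forall x\,p \Leftrightarrow \exists x\,\neg p$

• $\neg\exists x\,p \Leftrightarrow \forall x\,\neg p$.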
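
QN is a syntactic rule, but its two replacements can be sanity-checked semantically on a small finite universe. The following Python sketch is only such a check, not a proof: the function name `qn_holds` and the four-element universe are assumptions chosen for illustration, with `all` and `any` playing the roles of the universal and existential quantifiers.

```python
from itertools import product

def qn_holds(universe, p):
    """Check both QN replacements for the predicate p over a finite universe."""
    not_forall = not all(p(x) for x in universe)   # not (for all x) p
    exists_not = any(not p(x) for x in universe)   # (exists x) not p
    not_exists = not any(p(x) for x in universe)   # not (exists x) p
    forall_not = all(not p(x) for x in universe)   # (for all x) not p
    return not_forall == exists_not and not_exists == forall_not

universe = range(4)

# Try every one of the 16 possible predicates on a four-element universe.
results = [
    qn_holds(universe, lambda x, row=row: row[x])
    for row in product([False, True], repeat=len(universe))
]
qn_checked = all(results)
```

Every predicate on the sample universe passes, which agrees with the rule but does not replace the assumptions (2.12) and (2.13).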


QN is illustrated with the modern square of opposition (Figure 2.4). Negations of 
quantified formulas are found at opposite corners. A version of the Square is found in 
Aristotle’s De Interpretatione, dating around 350 BC (Aristotle 1984). 

Whenever we negate formulas of the form $\forall x\,p$ or $\exists x\,p$, to make it easier to read,
the final form should not have a negation immediately to the left of any quantifier, and
using the replacement rules, the negation should be as far into the formula p as possible.
We say that such negations are in positive form.


M@ EXAMPLE 2.3.2 


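Find the negation of $\forall x\,(p \wedge q)$ and put the final answer in positive form.

$\neg\forall x\,(p \wedge q) \Leftrightarrow \exists x\,\neg(p \wedge q) \Leftrightarrow \exists x\,(\neg p \vee \neg q).$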


The next example will use De Morgan’s law as the last one did. It also needs material 
implication and double negation. 




M@ EXAMPLE 2.3.3 


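Find the negation of $\forall x \exists y\,[p(x) \to q(y)]$ and put it into positive form.

$\neg\forall x \exists y\,[p(x) \to q(y)] \Leftrightarrow \exists x\,\neg\exists y\,[p(x) \to q(y)]$
$\Leftrightarrow \exists x \forall y\,\neg[p(x) \to q(y)]$
$\Leftrightarrow \exists x \forall y\,\neg[\neg p(x) \vee q(y)]$
$\Leftrightarrow \exists x \forall y\,[p(x) \wedge \neg q(y)].$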
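
The steps used in Examples 2.3.2 and 2.3.3 (QN, De Morgan's law, material implication, and double negation) can be combined into one recursive procedure. The sketch below is a hedged illustration: the tuple encoding and the function names `negate` and `positive` are assumptions made here, and atomic formulas are kept as opaque labels.

```python
# Assumed encoding: ("atom", label), ("not", p), ("and", p, q), ("or", p, q),
# ("implies", p, q), ("forall", x, p), ("exists", x, p).

def negate(p):
    """Return the negation of p in positive form."""
    op = p[0]
    if op == "atom":
        return ("not", p)
    if op == "not":                      # double negation
        return positive(p[1])
    if op == "and":                      # De Morgan
        return ("or", negate(p[1]), negate(p[2]))
    if op == "or":                       # De Morgan
        return ("and", negate(p[1]), negate(p[2]))
    if op == "implies":                  # material implication, then De Morgan
        return ("and", positive(p[1]), negate(p[2]))
    if op == "forall":                   # QN
        return ("exists", p[1], negate(p[2]))
    if op == "exists":                   # QN
        return ("forall", p[1], negate(p[2]))
    raise ValueError(f"unknown formula: {p!r}")

def positive(p):
    """Push every negation inside p as far inward as possible."""
    op = p[0]
    if op == "atom":
        return p
    if op == "not":
        return negate(p[1])
    if op in ("and", "or", "implies"):
        return (op, positive(p[1]), positive(p[2]))
    if op in ("forall", "exists"):
        return (op, p[1], positive(p[2]))
    raise ValueError(f"unknown formula: {p!r}")

px, qy = ("atom", "p(x)"), ("atom", "q(y)")
example_2_3_3 = ("forall", "x", ("exists", "y", ("implies", px, qy)))
negated = negate(example_2_3_3)
```

Here `negated` encodes the formula with an existential quantifier on x, a universal quantifier on y, and the conjunction of p(x) with the negation of q(y), matching the last line of Example 2.3.3.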


Proofs with Universal Formulas 


Consider the sentence all multiples of 4 are even. This implies, for instance, that 8,
100, and $-16$ are even. To generalize this reasoning to formulas means that whenever
we have $\forall x\,p(x)$, we also have p(a), where a might be either a constant symbol such as
8, 100, and $-16$ or a randomly chosen constant symbol. This gives the first inference
rule that involves a quantifier. We do not prove it, but take it to be an axiom. Observe
the use of Definition 2.1.6.


■ INFERENCE RULE 2.3.4 [Universal Instantiation (UI)]


If p(x) is an S-formula, then for every constant symbol a from S,
$\forall x\,p(x) \Rightarrow p(a).$


We make two observations about UI. First, since the resulting formula is to be part
of a proof, the substitution must yield a sentence, so a must be a constant symbol.
Second, we use the notation p(x) to represent the formula instead of p because $\forall x\,p(x)$
will be part of a proof, which means, again, that it must be a sentence. Writing p(x)
limits the formula to have only x as a possible free variable, so $\forall x\,p(x)$ is a sentence.
If the formula p had free variables other than x, then $\forall x\,p$ would not be a sentence and
not suitable for a proof.


@ EXAMPLE 2.3.5 


Let p, q, and r be S-formulas and a and b be constant symbols from S. The 
following are legitimate uses of UI. Notice that each of the inferences results in 
an S-formula. 


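• $\forall x\,[p(x) \to q(x)] \Rightarrow p(a) \to q(a)$

• $\forall x\,[p(x) \vee \forall y\,q(y)] \Rightarrow p(a) \vee \forall y\,q(y)$

• $\forall x \forall y\,[q(x) \vee r(y)] \Rightarrow \forall y\,[q(a) \vee r(y)]$

• $\forall y\,[q(a) \vee r(y)] \Rightarrow q(a) \vee r(a)$

• $\forall y\,[q(a) \vee r(y)] \Rightarrow q(a) \vee r(b)$.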




Before we can write formal proofs, we need a rule that will attach a universal quan- 
tifier. It will be different from universal instantiation, for it requires a criterion on the 
constant. 


@ DEFINITION 2.3.6 


Let a be a constant symbol introduced into a formal proof by a substitution. If
the first occurrence of a is in a sentence that follows by UI, then a is arbitrary.
Otherwise, a is particular and can be denoted by $\hat{a}$ to serve as a reminder that
the symbol is not arbitrary.


The idea behind Definition 2.3.6 is that if a constant symbol a is arbitrary, it repre- 
sents a randomly selected object, but if a is particular, then a represents an object with 
at least one known property. This property can be identified by a formula p(x) so that 
we have p(a). 


M@ EXAMPLE 2.3.7 


The constant symbol a in the following is not arbitrary because its first occurrence,
in line 1, is not the result of UI.

1. $p(a)$    Given
2. $p(a) \vee q(a)$    Add

Therefore, these two lines should be written using $\hat{a}$ instead of a.

1. $p(\hat{a})$    Given
2. $p(\hat{a}) \vee q(\hat{a})$    Add

However, a in the following is arbitrary because its first occurrence is in line 2,
and line 2 follows from line 1 by UI.

1. $\forall x\,p(x)$    Given
2. $p(a)$    1 UI
3. $p(a) \vee q(a)$    Add


We can now introduce the rule of inference that allows us to attach universal quan- 
tifiers to formulas. We state it as an axiom. 


Mi INFERENCE RULE 2.3.8 [Universal Generalization (UG)] 


If p(x) is an S-formula with no particular constant symbols and a is an arbitrary
constant symbol from S,
$p(a) \Rightarrow \forall x\,p(x).$


Consider the following argument: 


All squares are rectangles. 
All rectangles are quadrilaterals. 
Therefore, all squares are quadrilaterals. 




Representing the premises by $\forall x\,[s(x) \to r(x)]$ and $\forall x\,[r(x) \to q(x)]$, a formal proof of
this includes UG:

1. $\forall x\,[s(x) \to r(x)]$    Given
2. $\forall x\,[r(x) \to q(x)]$    Given
3. $s(a) \to r(a)$    1 UI
4. $r(a) \to q(a)$    2 UI
5. $s(a) \to q(a)$    3, 4 HS
6. $\forall x\,[s(x) \to q(x)]$    5 UG

Since a was introduced in line 3 by UI, it is an arbitrary constant symbol. In addition,
$s(a) \to q(a)$ contains no particular constant symbols that appeared in the proof by
substitution. Hence, the application of UG in line 6 is legal.


Mi EXAMPLE 2.3.9 
These are illegal uses of universal generalization. 


• Let p(x) be an S-formula with c a constant symbol.

1. $p(c)$    Given
2. $\forall x\,p(x)$    1 UG [error]

The constant symbol c in line 1 is particular, even without being written as
$\hat{c}$. It was not introduced to the proof by UI. Therefore, universal
generalization does not apply.

• Suppose that a is an arbitrary constant symbol.

1. $a + \hat{b} = 0$
2. $\forall x\,(x + \hat{b} = 0)$    1 UG [error]

The restriction against p(x) containing particular symbols prevents the
errant conclusion in line 2.

• The following is an attempt to prove $\forall x \forall y\,(x + y = 2 \cdot x)$ from the formula
$\forall x\,(x + x = 2 \cdot x)$.

1. $\forall x\,(x + x = 2 \cdot x)$    Given
2. $a + a = 2 \cdot a$    1 UI
3. $\forall y\,(a + y = 2 \cdot a)$    2 UG [error]
4. $\forall x \forall y\,(x + y = 2 \cdot x)$    3 UG

Although the constant symbol a in line 2 is arbitrary, the proof is not valid.
The reason is that an illegal substitution was made in line 3. To see this, let

$p(x) := x + x = 2 \cdot x.$

Applying universal generalization gives

$\forall y\,(y + y = 2 \cdot y)$

because $p(y) \Leftrightarrow y + y = 2 \cdot y$, but this is not what was written in line 3.


Now for some formal proofs. 


@ EXAMPLE 2.3.10 


Prove: $\forall x \forall y\,p(x, y) \vdash \forall y \forall x\,p(x, y)$

1. $\forall x \forall y\,p(x, y)$    Given
2. $\forall y\,p(a, y)$    1 UI
3. $p(a, b)$    2 UI
4. $\forall x\,p(x, b)$    3 UG
5. $\forall y \forall x\,p(x, y)$    4 UG

Since both a and b first appear because of UI, they are arbitrary and universal
generalization can be applied to both constant symbols.


HM EXAMPLE 2.3.11 
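Prove: $\forall x\,[p(x) \to q(x)],\ \forall x\,\neg[q(x) \vee r(x)] \vdash \forall x\,\neg p(x)$

1. $\forall x\,[p(x) \to q(x)]$    Given
2. $\forall x\,\neg[q(x) \vee r(x)]$    Given
3. $p(a) \to q(a)$    1 UI
4. $\neg[q(a) \vee r(a)]$    2 UI
5. $\neg q(a) \wedge \neg r(a)$    4 DeM
6. $\neg q(a)$    5 Simp
7. $\neg p(a)$    3, 6 MT
8. $\forall x\,\neg p(x)$    7 UG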


Notice that the a in line 3 was introduced because of UI. Hence, a is arbitrary 
throughout the proof. 


Proofs with Existential Formulas 


Since 4 + 5 = 9, we conclude that there exists x such that x + 5 = 9. Since we
can construct an isosceles triangle, we conclude that isosceles triangles exist. This 
motivates the next inference rule. 




@ THEOREM 2.3.12 [Existential Generalization (EG)] 
If p(x) is an S-formula and a is a constant symbol from S,
$p(a) \Rightarrow \exists x\,p(x).$


PROOF
Assume p(a). Suppose that $\forall x\,\neg p(x)$. By UI we have that $\neg p(a)$. Therefore,
$\neg\forall x\,\neg p(x)$ by IP, from which $\exists x\,p(x)$ follows by QN. ■


M@ EXAMPLE 2.3.13 


Each of the following is a valid use of existential generalization. 
• $p(a) \wedge \neg r(a) \Rightarrow \exists x\,[p(x) \wedge \neg r(x)]$

• $q(a) \wedge t(b) \Rightarrow \exists y\,[q(a) \wedge t(y)]$.


Before we write some proofs, here is the inference rule that allows us to detach 
existential quantifiers. 


THEOREM 2.3.14 [Existential Instantiation (EI)] 


If p(x) is an S-formula,
$\exists x\,p(x) \Rightarrow p(\hat{a}),$

where $\hat{a}$ is a constant symbol from S that has no occurrence in the formal proof
prior to $p(\hat{a})$.

PROOF

Assume that $\exists x\,p(x)$ does not infer p(b) for any constant symbol b. Suppose
$\exists x\,p(x)$. Combined with the assumption, this implies that $\neg p(b)$ for every constant
symbol b by the law of the excluded middle (page 37) and DS. That is, $\neg p(a)$ for
an arbitrary constant symbol a. Therefore, $\forall x\,\neg p(x)$ by UG, so $\neg\exists x\,p(x)$ by QN,
which is a contradiction. Therefore, $\exists x\,p(x) \Rightarrow p(a)$ for some constant symbol a,
and a is particular by Definition 2.3.6. ■


The constant symbol $\hat{a}$ obtained by EI is called a witness of $\exists x\,p(x)$.

The constant symbol a must have no prior occurrence in a proof when applying EI
because a used symbol already represents some object, so if a had
appeared in an earlier line, we would have no reason to assume that we could write p(a)
from $\exists x\,p(x)$. For example, given

$\exists x\,(x + 4 = 5)$ (2.14)

and

$\exists x\,(x + 2 = 13),$ (2.15)

we write a + 4 = 5 for some constant symbol a by EI and (2.14). By EI and (2.15),
we conclude that b + 2 = 13 for some constant symbol b, but inferring a + 2 = 13 is
invalid because $a \neq b$.




M@ EXAMPLE 2.3.15 


The following are legal uses of existential instantiation assuming that a and b 
have no prior occurrences. 


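• $\exists x\,[p(x) \wedge q(x)] \Rightarrow p(\hat{a}) \wedge q(\hat{a})$
• $\exists y\,[r(a, y, c) \to r(a, y, c)] \Rightarrow r(a, \hat{b}, c) \to r(a, \hat{b}, c)$
• $\exists x \forall y \exists z\,q(x, y, z) \Rightarrow \forall y \exists z\,q(\hat{a}, y, z)$.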


@ EXAMPLE 2.3.16 
Existential Instantiation cannot be used to justify either of the following. 


• $\exists z\,[p(z) \vee q(z)]$ does not imply $p(\hat{b}) \vee q(z)$ by EI because the substitution
was not made correctly. The result should have been $p(\hat{b}) \vee q(\hat{b})$.

• $\exists x \exists y\,p(a, x, y)$ does not imply $\exists y\,p(a, a, y)$ by EI because a has a prior
occurrence. Also, notice that the hat notation was not used.


M@ EXAMPLE 2.3.17 


Assume that x is a real number and consider the following. 


1. $\forall x \exists y\,(x + y = 0)$    Given
2. $\exists y\,(a + y = 0)$    1 UI
3. $a + \hat{b} = 0$    2 EI
4. $\forall x\,(x + \hat{b} = 0)$    3 UG [error]
5. $\exists y \forall x\,(x + y = 0)$    4 EG

The conclusion in line 5 is incorrect. The problem lies in line 4 where UG was
applied despite the particular constant symbol in line 3 (compare Example 2.3.9).
This example makes clear why there is a restriction on particular elements in UG
(Inference Rule 2.3.8). Since $\hat{b}$ represents a particular real number, line 3 cannot
be used to conclude that all real numbers plus that particular $\hat{b}$ equals 0. We
know that $\hat{b}$ is the witness to line 2, but that is all we know about it. To correct
the argument, we can essentially reverse the steps to arrive back at the premise.

1. $\forall x \exists y\,(x + y = 0)$    Given
2. $\exists y\,(a + y = 0)$    1 UI
3. $a + \hat{b} = 0$    2 EI
4. $\exists y\,(a + y = 0)$    3 EG
5. $\forall x \exists y\,(x + y = 0)$    4 UG

Notice that there is no particular constant symbol in line 4, so line 5 does legally
follow by UG.
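
The bookkeeping behind Definition 2.3.6 and the restriction in UG can be sketched as a small checker. This Python sketch reflects one reading of those rules and rests on assumptions made here: a proof is recorded only as a list of pairs (constant symbols appearing in the line, name of the rule that justified it), and the function names `classify_constants` and `ug_allowed` are hypothetical.

```python
def classify_constants(proof):
    """Label each constant by the rule that justified its first occurrence."""
    status = {}
    for constants, rule in proof:
        for c in constants:
            if c not in status:
                # First occurrence: arbitrary only if the line follows by UI.
                status[c] = "arbitrary" if rule == "UI" else "particular"
    return status

def ug_allowed(proof, line_number, constant):
    """Can UG generalize `constant` in the given (1-indexed) line?"""
    status = classify_constants(proof[:line_number])
    constants, _ = proof[line_number - 1]
    # The generalized symbol must occur and no constant in the line may be particular.
    return constant in constants and all(status[c] == "arbitrary" for c in constants)

# Lines 1-3 of the faulty proof in Example 2.3.17 (only constants and rules are recorded).
faulty = [([], "Given"), (["a"], "UI"), (["a", "b"], "EI")]
# Lines 1-4 of the corrected proof; line 4 no longer mentions b.
corrected = faulty + [(["a"], "EG")]

statuses = classify_constants(faulty)
blocked = ug_allowed(faulty, 3, "a")
permitted = ug_allowed(corrected, 4, "a")
```

With this encoding, `blocked` is false because line 3 contains the particular symbol b, while `permitted` is true because line 4 contains only the arbitrary symbol a.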


Here are some formal proofs that use existential instantiation and generalization. 


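■ EXAMPLE 2.3.18

Prove: $\exists x\,[p(x) \wedge q(x)],\ \forall x\,[p(x) \to r(x)] \vdash \exists x\,r(x)$

1. $\exists x\,[p(x) \wedge q(x)]$    Given
2. $\forall x\,[p(x) \to r(x)]$    Given
3. $p(\hat{c}) \wedge q(\hat{c})$    1 EI
4. $p(\hat{c}) \to r(\hat{c})$    2 UI
5. $p(\hat{c})$    3 Simp
6. $r(\hat{c})$    4, 5 MP
7. $\exists x\,r(x)$    6 EG


■ EXAMPLE 2.3.19

Prove: $\forall x \exists y\,[q(x) \wedge t(y)] \vdash \forall x\,[q(x) \wedge (\exists y)t(y)]$

1. $\forall x \exists y\,[q(x) \wedge t(y)]$    Given
2. $\exists y\,[q(a) \wedge t(y)]$    1 UI
3. $q(a) \wedge t(\hat{c})$    2 EI
4. $t(\hat{c}) \wedge q(a)$    3 Com
5. $t(\hat{c})$    4 Simp
6. $\exists y\,t(y)$    5 EG
7. $q(a)$    3 Simp
8. $q(a) \wedge (\exists y)t(y)$    6, 7 Conj
9. $\forall x\,[q(x) \wedge (\exists y)t(y)]$    8 UG


■ EXAMPLE 2.3.20

Prove: $p(a) \to \exists x\,[q(x) \wedge r(x)],\ p(a) \vdash \exists x\,r(x)$

1. $p(a) \to \exists x\,[q(x) \wedge r(x)]$    Given
2. $p(a)$    Given
3. $\exists x\,[q(x) \wedge r(x)]$    1, 2 MP
4. $q(b) \wedge r(b)$    3 EI
5. $r(b) \wedge q(b)$    4 Com
6. $r(b)$    5 Simp
7. $\exists x\,r(x)$    6 EG
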
Exercises 
1. Use QN and other replacement rules to determine whether the following are legal 
replacements. 
(a) "Wxp(x) @ Vx7p(x) 
(b) 7Axp(x) & WVxp(x) 
(c) Vx-p(x) = Axp(x) 
(d) Ax [p(x) > q(x)]  7Vx [>p(x) > 79qx)] 
(e) ~WxAyp(x, y) <> AxVy7p(x, y) 


(f) 


AV xd yp(x, y) > AyWxp(x, y) 




2. Negate and put into positive form. 
(a) Ax [q(x) > r(x)] 
(b) VxAy [p(x) A q(y)] 
(c) Axdy[pOX) Vv a(x, y)] 
(d) VxVy [p(x) V z)qQ, 2)] 
(e) Axar(x) V Vx [9(x)  =p(x)] 
(f) VxVydz(p(x) > [q(y) A r(Z))) > Axrq(x) 


3. Determine whether each pair of propositions are negations. If they are not, write 
the negation of both. 
(a) Every real number has a square root. 
Every real number does not have a square root. 
(b) Every multiple of four is a multiple of two. 
Some multiples of two are multiples of four. 
(c) For all x, if x is odd, then $x^2$ is odd.
There exists x such that if x is odd, then $x^2$ is even.
(d) There exists an integer x such that x + 1 = 10.
For all integers x, $x + 1 \neq 10$.


4. Write the negation of the following propositions in positive form and in English. 
(a) For all x, there exists y such that y/x = 9. 
(b) There exists x so that xy = 1 for all real numbers y.
(c) Every multiple of ten is a multiple of five. 
(d) No interval contains a rational number. 
(e) There is an interval that contains a rational number. 


5. Let f be a function and c be a real number in the open interval I. Then, f is
continuous at c if for every $\epsilon > 0$, there exists $\delta > 0$ such that for all x in I, if
$0 < |x - c| < \delta$, then $|f(x) - f(c)| < \epsilon$.
(a) Write what it means for f to be not continuous at c. 
(b) The function f is continuous on an open interval if it is continuous at every 
point of the interval. Write what it means for f to be not continuous on an 
open interval. 


6. Let f be a function and c a real number in the open interval I. Then, f is uniformly
continuous on I means that for every $\epsilon > 0$, there exists $\delta > 0$ such that for all c and
x in I, if $0 < |x - c| < \delta$, then $|f(x) - f(c)| < \epsilon$.

(a) Write what it means for f to be not uniformly continuous on I. 


(b) How does f being continuous on I differ from f being uniformly continuous
on I?


7. Prove using QN. 
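(a) $\exists x\,p(x) \vdash \forall x\,\neg p(x) \to \forall x\,q(x)$
(b) $\forall x\,[p(x) \to q(x)],\ \forall x\,[q(x) \to r(x)],\ \neg\forall x\,r(x) \vdash \exists x\,\neg p(x)$
(c) $\forall x\,p(x) \to \forall y\,[q(y) \to r(y)],\ \exists x\,[q(x) \wedge \neg r(x)] \vdash \exists x\,\neg p(x)$
(d) $\exists x\,p(x) \to \exists y\,q(y),\ \forall x\,\neg q(x) \vdash \forall x\,\neg p(x)$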




8. Find all errors in the given proofs. 
(a) “Ax [p(x) V q(x)] , dxrg(x) F Axp(x)” 


Attempted Proof 

1. Ax [p(x) V q(x)] Given 
2. Axn7gq(x) Given 
3. p(c)V q(c) 1 EI 

4. -7q(c) 2 El 

5.  -p(c) 3,4DS 
6. Axn7p(x) 5 EG 


(b) “Wxp(x) F AxVy [p0x) Vv a)” 


Attempted Proof 

1. VWxp(x) Given 
2. p(é) 1 UI 
3. p(é)V qa) 2 Add 
4. Vy[p(@) Vv ay) 3 UG 
5. Aywx[p(x)VqQy)] 4EG 


(c) “Axp(x), dxg(x) F Vx [p(x) A qx) ]” 


Attempted Proof 

1. Axp(x) Given 

2. Axq(x) Given 

3. p(é) 1 El 

4. q(é) 2 El 

5. p(é)Aq(c) 3, 4 Conj 
6. Vx[p(x)Aq(x)] 5UG 


9. Prove. 
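(a) $\forall x\,p(x) \vdash \forall x\,[p(x) \vee q(x)]$
(b) $\forall x\,p(x),\ \forall x\,[q(x) \to \neg p(x)] \vdash \forall x\,\neg q(x)$
(c) $\forall x\,[p(x) \to q(x)],\ \forall x\,p(x) \vdash \forall x\,q(x)$
(d) $\forall x\,[p(x) \vee q(x)],\ \forall x\,\neg q(x) \vdash \forall x\,p(x)$
(e) $\exists x\,p(x) \vdash \exists x\,[p(x) \vee q(x)]$
(f) $\exists x \exists y\,p(x, y) \vdash \exists y \exists x\,p(x, y)$
(g) $\forall x \forall y\,p(x, y) \vdash \forall y \forall x\,p(x, y)$
(h) $\forall x\,\neg p(x),\ \exists x\,[q(x) \to p(x)] \vdash \exists x\,\neg q(x)$
(i) $\exists x\,p(x),\ \forall x\,[p(x) \to q(x)] \vdash \exists x\,q(x)$
(j) $\forall x\,[p(x) \to q(x)],\ \forall x\,[r(x) \to s(x)],\ \exists x\,[p(x) \vee r(x)] \vdash \exists x\,[q(x) \vee s(x)]$
(k) $\exists x\,[p(x) \wedge \neg r(x)],\ \forall x\,[q(x) \to r(x)] \vdash \exists x\,\neg q(x)$
(l) $\forall x\,p(x),\ \forall x\,([p(x) \vee q(x)] \to [r(x) \wedge s(x)]) \vdash \exists x\,s(x)$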
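(m) $\exists x\,p(x),\ \exists x\,p(x) \to \forall x \exists y\,[p(x) \to q(y)] \vdash \forall x\,q(x) \vee \exists x\,r(x)$
(n) $\exists x\,p(x),\ \exists x\,q(x),\ \exists x \exists y\,[p(x) \wedge q(y)] \to \forall x\,r(x) \vdash \forall x\,r(x)$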


10. Prove the following: 
(a) p(x) A Ayq(y) = Aylp(x) A q(y)] 
(b) p(x) V Ayq(y) = Aylp(x) V a(y)] 
(c) p(x) AVyq(y) = Vylp(x) A q(y)] 
(d) p(x) V Vyq(y) @ Vy[p(x) V a(y)] 
(e) Axp(x) > q(y) > Vx[p(x) > p(y)] 
(f) Vxp(x) > q(y) & Ax[p(x) > p(y)] 
(g) p(x) > Ayq(q) > Ay[P(x) > p)] 
(h) p(x) > Vyq(q) @ Vylp(x) > pQ)] 


11. An S-formula is in prenex normal form if it is of the form 


$Q_0 x_0\,Q_1 x_1 \cdots Q_{n-1} x_{n-1}\,q,$


where $Q_0, Q_1, \ldots, Q_{n-1}$ are quantifier symbols and q is an S-formula. Every S-formula
can be replaced with an S-formula in prenex normal form, although variables might
need to be renamed. Use Exercise 10 and QN to put the given formulas in prenex
normal form.


(a) [p(x) V q(x)] > Ayla) > rQ)] 

(b) =Vx[p(x) > 7Ayq(y)] 

(c) Axp(x) > Vxdy[p(x) > qQ)] 

(d) Vx[p(x) > q(y)] A 75yVz[r(Q) > s(z)] 


2.4 PROOF METHODS 


The purpose of the propositional logic of Chapter 1 is to model the basic reasoning that 
one does in mathematics. Rules that determine truth (semantics) and establish valid 
forms of proof (syntactics) were developed. The logic developed in this chapter is an 
extension of propositional logic. It is called first-order logic. As with propositional 
logic, first-order logic provides a working model of a type of deductive reasoning that 
happens when studying mathematics, but with a greater emphasis on what a particular 
proposition is communicating about its subject. Geometry can serve as an example. 
To solve geometric problems and prove geometric propositions means to work with the 
axioms and theorems of geometry. What steps are legal when doing this are dictated by 
the choice of the logic. Because the subjects of geometric propositions are mathemati- 
cal objects, such as points, lines, and planes, first-order logic is a good choice. However, 
sometimes other logical systems are used. An example of such an alternative is second- 
order logic, which allows quantification over function and relation symbols (page 73) 
in addition to quantification over variable symbols. Whichever logic is chosen, that 
logic provides the general rules of reasoning for the mathematical theory. Since it has 
its own axioms and theorems, a logic itself is a theory, but because it is intended to 
provide rules for other theories, it is sometimes called a metatheory (Figure 2.5). 
Although all mathematicians use logic, they usually do not use symbolic logic.
Instead, their proofs are written using sentences in English or some other human language
and usually do not provide all of the details. Call these paragraph proofs. Their
intention is to lead the reader from the premises to the conclusion, making the result
convincing. In many instances, a proof could be translated into first-order logic,
but this is not needed to meet the need of the mathematician. However, that it could
be translated means that we can use first-order logic to help us write paragraph proofs,
and in this section, we make that connection.


[Figure: a diagram with the labels "First-order logic (metatheory)", "Mathematics", and "Geometry (theory)".]


Figure 2.5 First-order logic is a metatheory of mathematical theories.


Universal Proofs 


Our first paragraph proofs will be for propositions with universal quantifiers. To prove
$\forall x\,p(x)$ from a given set of premises, we show that every object satisfies p(x) assuming
those premises. Since the proofs are mathematical, we can restrict the objects to a
particular universe. A proposition of the form $\forall x\,p(x)$ is then interpreted to mean that
p(x) holds for all x from a given universe. This restriction is reasonable because we are
studying mathematical things, not airplanes or puppies. To indicate that we have made 
a restriction to a universe, we randomly select an object of the universe by writing an 
introduction. This is a proposition that declares the type of object represented by the 
variable. The following are examples of introductions: 


Let a be a real number. 
Take a to be an integer. 
Suppose a is a positive integer. 


From here we show that p(a) is true. This process is exemplified by the next diagram:


$\forall x\,p(x)$

↓

Let a be an object in the universe.

Prove p(a).


These types of proofs are called universal proofs. 


M@ EXAMPLE 2.4.1 
To prove that for all real numbers x,
$(x - 1)^3 = x^3 - 3x^2 + 3x - 1,$


we introduce a real number and then check the equation.


PROOF


Let a be a real number. Then,

$(a - 1)^3 = (a - 1)(a - 1)^2$
$= (a - 1)(a^2 - 2a + 1)$
$= a^3 - 3a^2 + 3a - 1.$ ■

For our next example, we need some terminology. 


HM DEFINITION 2.4.2 


For all integers a and b, a divides b (written as $a \mid b$) if $a \neq 0$ and there exists an
integer k such that b = ak.


Therefore, 6 divides 18, but 8 does not divide 18. In this case, write $6 \mid 18$ and $8 \nmid 18$.
If a | b, we can also write that b is divisible by a, a is a divisor or a factor of b, or b is 
a multiple of a. For this reason, to translate a predicate like 4 divides a, write 


a = 4k for some integer k. 
A common usage of divisibility is to check whether an integer is divisible by 2 or not.


■ DEFINITION 2.4.3


An integer n is even if $2 \mid n$, and n is odd if $2 \mid (n - 1)$.


We are now ready for the example.


M@ EXAMPLE 2.4.4 


Let us prove the proposition

the square of every even integer is even.

This can be rewritten using a variable:

for all even integers n, $n^2$ is even.

The proof goes like this.

PROOF

Let n be an even integer. This means that there exists an integer k such
that n = 2k, so we can calculate:

$n^2 = (2k)^2 = 4k^2 = 2(2k^2).$

Since $2k^2$ is an integer, $n^2$ is even. ■


Notice how the definition was used in the proof. After the even number was 
introduced, a proposition that translated the introduction into a form that was 
easier to use was written. This was done using Definitions 2.4.2 and 2.4.3. 
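
Definitions 2.4.2 and 2.4.3 translate directly into small predicates, which can be used to spot-check the proposition on a finite range. The Python sketch below is only a numerical check under assumptions made here (the function names and the sample range are illustrative choices); it does not replace the universal proof above.

```python
def divides(a, b):
    """a | b as in Definition 2.4.2: a is nonzero and b = a*k for some integer k."""
    return a != 0 and b % a == 0

def is_even(n):
    """n is even if 2 | n (Definition 2.4.3)."""
    return divides(2, n)

def is_odd(n):
    """n is odd if 2 | (n - 1) (Definition 2.4.3)."""
    return divides(2, n - 1)

# Spot-check "the square of every even integer is even" on a sample of integers.
sample = range(-50, 51)
squares_of_evens_are_even = all(is_even(n * n) for n in sample if is_even(n))
```

The check succeeds on the sample, but only the paragraph proof covers every even integer.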


Existential Proofs 


Suppose that we want to write a paragraph proof for $\exists x\,p(x)$. This means that we must
show that there exists at least one object of the universe that satisfies the formula p(x). 
It will be our job to find that object. To do this directly, we pick an object that we 
think will satisfy p(x). This object is called a candidate. We then check that it does 
satisfy p(x). This type of a proof is called a direct existential proof, and its structure 
is illustrated as follows: 


$\exists x\,p(x)$

Choose a candidate from the universe.

↓

Check that the candidate satisfies p(x).


Mi EXAMPLE 2.4.5 


To prove that there exists an integer x such that $x^2 + 2x - 3 = 0$, we find an
integer that satisfies the equation. A basic factorization yields

$(x + 3)(x - 1) = 0.$

Since x = −3 or x = 1 will work, we choose (arbitrarily) x = 1. Therefore,

$(1)^2 + 2(1) - 3 = 0,$

proving the existence of the desired integer.
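
The "choose a candidate, then check it" pattern of a direct existential proof can be mirrored by a short search. This Python sketch assumes a small search range chosen for illustration; the name `p` simply encodes the equation from Example 2.4.5.

```python
def p(x):
    """The formula of Example 2.4.5: x^2 + 2x - 3 = 0."""
    return x**2 + 2*x - 3 == 0

# Search a small range of integers for candidates that satisfy p.
candidates = [x for x in range(-10, 11) if p(x)]

# A single verified candidate is enough to establish the existential claim.
witness = 1
witness_works = p(witness)
```

The search finds both roots predicted by the factorization, and checking the single candidate 1 is all that the existential proof requires.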




Suppose that we want to prove that there is a function f such that the derivative of f
is 2x. After a quick mental calculation, we choose $f(x) = x^2$ as a candidate and check
to find that

$\frac{d}{dx} x^2 = 2x.$ (2.16)

Notice that d/dx is a function that has functions as its inputs and outputs. Let d
represent this function. That is, d is a function symbol such that

$d(f) = \frac{d}{dx} f.$

A formula that represents the proposition that was just proved is

$\exists f\,(d(f(x)) = 2x).$ (2.17)


This is a second-order formula (page 73) because the variable symbol x represents a 
real number (an object of the universe) and f is a function symbol taking real numbers 
as arguments. Although this kind of reasoning is common to mathematics, it cannot 
be written as a first-order formula. This shows that there is a purpose to second-order 
logic. It is not a novelty. 

EXAMPLE 2.4.6 


To represent (2.16) as a first-order formula, let d be a unary function symbol 
representing the derivative and write 


$\forall x\,[d(x^2) = 2x].$


Notice, however, that this formula does not convey the same meaning as (2.17). 


Multiple Quantifiers 


Let us take what we have learned concerning the universal and existential quantifiers 
and write paragraph proofs involving both. The first example is a simple one from 
algebra but will nicely illustrate our method. 


EXAMPLE 2.4.7

Prove that for every real number x, there exists a real number y so that x + y = 2.
This translates to

∀x∃y(x + y = 2)

with the universe equal to the real numbers. Remembering that a universal quantifier
must apply to all objects of the universe and an existential quantifier means
that we must find the desired object, we have the following:

    ∀x             ← Let x be a real number.
    ∃y             ← Find a real number y.
    (x + y = 2)

After taking an arbitrary x, our candidate will be y = 2 − x.


PROOF 


Let x be a real number. Choose y = 2 — x and calculate: 


x + y = x + (2 − x) = 2. ∎


Now let us switch the order of the quantifiers. 


EXAMPLE 2.4.8

To see that there exists an integer x such that for all integers y, x + y = y, we use
the following:

    ∃x             ← Find an integer x.
    ∀y             ← Let y be an integer.
    (x + y = y)

In the proof, the first goal is to identify a candidate. Then, we must show that it
works with every integer.

PROOF

We claim that 0 is the sought-after object. To see this, let y be an integer.
Then 0 + y = y. ∎


The next example will involve two existential quantifiers. Therefore, we have to find 
two candidates. 


EXAMPLE 2.4.9

We prove that there exist real numbers a and b so that for every real number x,

(a + 2b)x + (2a − b) = 2x − 6.

Translating, we arrive at

    ∃a             ← Find a real number a.
    ∃b             ← Find a real number b.
    ∀x             ← Let x be a real number.
    [(a + 2b)x + (2a − b) = 2x − 6]    ← Show (a + 2b)x + (2a − b) = 2x − 6.

We have to choose two candidates, one for a and one for b, and then check by
taking an arbitrary x.

PROOF

Solving the system,

a + 2b = 2,
2a − b = −6,

we choose a = −2 and b = 2. Let x be a real number. Then,

(a + 2b)x + (2a − b) = (−2 + 2 · 2)x + (2 · (−2) − 2) = 2x − 6. ∎
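
The two candidates can be recovered and checked numerically. The Python sketch below is an illustration under stated assumptions: the helper solve_2x2 is a hypothetical name introduced here, it uses Cramer's rule (one of several ways to solve the system), and the final check samples only a few values of x, so it supports the proof rather than replacing it.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*u + a12*v = b1 and a21*u + a22*v = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system has no unique solution")
    u = Fraction(b1 * a22 - a12 * b2, det)
    v = Fraction(a11 * b2 - a21 * b1, det)
    return u, v


# The system a + 2b = 2, 2a - b = -6 from the proof.
a, b = solve_2x2(1, 2, 2, 2, -1, -6)

# Spot-check the identity (a + 2b)x + (2a - b) = 2x - 6 on sample values of x.
identity_holds = all((a + 2 * b) * x + (2 * a - b) == 2 * x - 6
                     for x in range(-5, 6))
print(a, b, identity_holds)
```

Exact fractions are used instead of floating-point numbers so that the equality test is not affected by rounding.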


Counterexamples 


There are many times in mathematics when we must show that a proposition of the
form ∀xp(x) is false. This can be accomplished by proving ∃x¬p(x) is true, and this
is done by showing that an object a exists in the universe such that p(a) is false. This
object is called a counterexample to ∀xp(x).

EXAMPLE 2.4.10

Show false:

x + 2 = 7 for all real numbers x.

To do this, we show that ∃x(x + 2 ≠ 7) is true by noting that the real number 0
satisfies x + 2 ≠ 7. Hence, 0 is a counterexample to ∀x(x + 2 = 7).

The idea of a counterexample can be generalized to cases with multiple universal
quantifiers.

EXAMPLE 2.4.11

We know from algebra that the proposition “every quadratic polynomial with real
coefficients has a real zero” is false. This can be symbolized as

∀a∀b∀c∃x(ax² + bx + c = 0),

where the universe is the collection of real numbers. The counterexample is found
by demonstrating

∃a∃b∃c∀x(ax² + bx + c ≠ 0).

Many polynomials could be chosen, but we select x² + 1. Its only zeros are i and
−i. This means that a = 1, b = 0, and c = 1 is our counterexample.
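
A quick way to test a proposed counterexample of this kind is the discriminant. The Python sketch below assumes the standard fact that a quadratic with real coefficients and a ≠ 0 has a real zero exactly when b² − 4ac ≥ 0; the function name has_real_zero is introduced here only for illustration.

```python
def has_real_zero(a, b, c):
    """Discriminant test for a*x**2 + b*x + c, assuming a != 0."""
    if a == 0:
        raise ValueError("not a quadratic polynomial")
    return b * b - 4 * a * c >= 0


# The polynomial x**2 + 1 chosen in the example: a = 1, b = 0, c = 1.
counterexample = (1, 0, 1)
print(has_real_zero(*counterexample))   # False confirms it is a counterexample
print(has_real_zero(1, 2, -3))          # x**2 + 2x - 3 does have real zeros
```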


Direct Proof 


Direct Proof is the preferred method for proving implications. To use Direct Proof to
write a paragraph proof, identify the antecedent and consequent of the implication and
then follow these steps:

• assume the antecedent,

• translate the antecedent,

• translate the consequent so that the goal of the proof is known,

• deduce the consequent.

Our first example will use Definitions 2.4.2 and 2.4.3. Notice the introductions in the
proof.

EXAMPLE 2.4.12

We use Direct Proof to write a paragraph proof of the proposition

for all integers x, if 4 divides x, then x is even.

First, randomly choose an integer a and then identify the antecedent and consequent:

if [4 divides x], then [x is even].

Use these to identify the structure of the proof:

    4 divides a.            ← Assume the antecedent.
    This means a = 4k.      ← Translate the antecedent.
        ⋮
    a = 2l.                 ← Translate the consequent.
    2 divides a.            ← Deduce the consequent.

Now write the final version from this structure.

PROOF

Assume that 4 divides the integer a. This means a = 4k for some integer
k. We must show that a = 2l for some integer l, but we know that a =
4k = 2(2k). Hence, let l = 2k. ∎


Sometimes it is difficult to prove a conditional directly. An alternative is to prove 
the contrapositive. This is sometimes easier or simply requires fewer lines. The next 
example shows this method in a paragraph proof. 




EXAMPLE 2.4.13

Let us show that

for all integers n, if n² is odd, then n is odd.

A direct proof of this is a problem. Instead, we prove its contrapositive,

if n is not odd, then n² is not odd.

In other words, we prove that

if n is even, then n² is even.

This will be done using Direct Proof:

    Let n be even.          ← Assume the antecedent.
    This means n = 2k.      ← Translate the antecedent.
        ⋮
    n² = 2l.                ← Translate the consequent.
    2 divides n².           ← Deduce the consequent.

This leads to the final proof.

PROOF

Let n be an even integer. This means that n = 2k for some integer k. To
see that n² is even, calculate to find

n² = (2k)² = 4k² = 2(2k²). ∎

Notice that this proof is basically the same as the proof that the square of every
even integer is even on page 99. This illustrates that there is a connection between
universal proofs and direct proofs (Exercise 4).


Existence and Uniqueness 


There will be times when we want to show that there exists exactly one object that
satisfies a given predicate p(x). In other words, there exists a unique object that satisfies
p(x). This is a two-step process.

• Existence: Show that there is at least one object that satisfies p(x).

• Uniqueness: Show that there is at most one object that satisfies p(x). This is
usually done by assuming both a and b satisfy p(x) and then proving a = b.

This means that to prove that there exists a unique x such that p(x), we prove

∃xp(x) ∧ ∀x∀y(p(x) ∧ p(y) → x = y).

Use direct or indirect existential proof to demonstrate that an object exists. The next
example illustrates proving uniqueness.




EXAMPLE 2.4.14

Let m and n be nonnegative integers with m ≠ 0. To show that there is at most
one pair of nonnegative integers r and q such that

r < m and n = mq + r,

suppose in addition to r and q that there exist nonnegative integers r′ and q′ such
that

r′ < m and n = mq′ + r′.

Assume that r′ > r. By Exercise 13, q > q′, so there exist u, v > 0 such that

r′ = r + u and q = q′ + v.

Therefore,

m(q′ + v) + r = mq′ + r + u,
mq′ + mv + r = mq′ + r + u,
mv = u.

Since v > 0, there exists w such that v = w + 1. Hence,

mw + m = m(w + 1) = mv = u,

so m ≤ u. However, since r < m and r′ < m, we have that u < m (Exercise 13),
which is impossible. Lastly, the assumption r > r′ leads to a similar contradiction.
Therefore, r = r′, which implies that q = q′.
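
The uniqueness claim can be spot-checked by brute force for small values. The Python sketch below is illustrative only: the function name quotient_remainder_pairs and the bounds used in the check are assumptions made here, and a finite check does not replace the argument above.

```python
def quotient_remainder_pairs(n, m):
    """All pairs (q, r) of nonnegative integers with r < m and n == m*q + r.

    Assumes n >= 0 and m >= 1, so any valid q satisfies q <= n.
    """
    return [(q, r) for q in range(n + 1) for r in range(m) if n == m * q + r]


# For every small n and m there should be exactly one pair.
unique_everywhere = all(len(quotient_remainder_pairs(n, m)) == 1
                        for n in range(30) for m in range(1, 10))
print(quotient_remainder_pairs(17, 5), unique_everywhere)
```

The single pair found for each n and m agrees with the one returned by Python's built-in divmod(n, m).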
EXAMPLE 2.4.15

To prove 2x + 1 = 5 has a unique real solution, we show

∃x(2x + 1 = 5)

and

∀x∀y(2x + 1 = 5 ∧ 2y + 1 = 5 → x = y).

• We know that x = 2 is a solution since 2(2) + 1 = 5.

• Suppose that a and b are solutions. We know that both 2a + 1 = 5 and
2b + 1 = 5, so we calculate:

2a + 1 = 2b + 1,
2a = 2b,
a = b.


Indirect Proof 


To use indirect proof, assume each premise and then assume the negation of the con- 
clusion. Then proceed with the proof until a contradiction is reached. At this point, 
deduce the original conclusion. 




EXAMPLE 2.4.16

Earlier we proved

for all integers n, if n² is odd, then n is odd

by showing its contrapositive. Here we nest an Indirect Proof within a Direct
Proof:

    Assume n² is odd.           ← Assume the antecedent.
        Suppose n is even.      ← Assume the negation.
            ⋮                   ← Deduce a contradiction.
        Contradiction.
    n is odd.                   ← Conclude the consequent.

We use this structure to write the paragraph proof.

PROOF

Take an integer n and let n² be odd. In order to obtain a contradiction,
assume that n is even. So, n = 2k for some integer k. Substituting, we
have

n² = (2k)² = 2(2k²),

showing that n² is even. This is a contradiction. Therefore, n is an odd
integer.


Indirect proof has been used to prove many famous mathematical results including the 
next example. 


EXAMPLE 2.4.17

We show that √2 is an irrational number. Suppose instead that

a/b = √2,

where a and b are integers, b ≠ 0, and the fraction a/b has been reduced. Then,

a²/b² = 2,

so a² = 2b². Therefore, a = 2k for some integer k [Exercise 11(c)]. We conclude
that b² = 2k², which implies that b is also even. However, this is impossible
because the fraction was reduced.


There are times when it is difficult or impossible to find a candidate for an existential 
proof. When this happens, an indirect existential proof can sometimes help. 




EXAMPLE 2.4.18

If n is an integer such that n > 1, then “n is prime” means that the only positive
divisors of n are 1 and n; otherwise, n is composite. From the definition, if n is
composite, there exist a and b such that n = ab and 1 < a ≤ b < n. For example, 2,
11, and 97 are prime, and 4 and 87 are composite. Euclid proved that there are
infinitely many prime numbers (Elements IX.20). Since it is impossible to find
all of these numbers, we prove this theorem indirectly. Suppose that there are
finitely many prime numbers and list them as

p₀, p₁, ..., pₙ₋₁.

Consider

q = p₀p₁ ⋯ pₙ₋₁ + 1.

If q is prime, q = pᵢ for some i = 0, 1, ..., n − 1, which would imply that pᵢ
divides 1. Therefore, q is composite, but this is also impossible because q must
have a prime factor, which again means that 1 would be divisible by a prime.
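
The construction in the argument can be tried on a short list of primes. The Python sketch below is an illustration only: the function names euclid_number and smallest_prime_factor and the sample list are choices made here. It shows that the number q leaves remainder 1 when divided by each listed prime, so any prime factor of q lies outside the list.

```python
from math import prod


def euclid_number(primes):
    """Return the product of the given primes, plus one."""
    return prod(primes) + 1


def smallest_prime_factor(n):
    """Smallest prime factor of an integer n > 1, by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


listed = [2, 3, 5, 7, 11, 13]
q = euclid_number(listed)
print(q, [q % p for p in listed], smallest_prime_factor(q))
```

Note that q itself need not be prime. For this list it is composite, yet its smallest prime factor is still a prime missing from the list, which is the case handled in the last sentence of the example.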


Biconditional Proof 


The last three types of proofs that we will examine rely on direct proof. The first of
these takes three forms, and they provide the usual method of proving biconditionals.
The first form follows because p → q and q → p imply (p → q) ∧ (q → p) by Conj,
which implies p ↔ q by Equiv.

INFERENCE RULE 2.4.19 [Biconditional Proof (BP)]

p → q, q → p ⇒ p ↔ q.

The rule states that a biconditional can be proved by showing both implications. Each is
usually proved with direct proof. As is seen in the next example, the p → q subproof is
introduced by (→) and its converse with (←). The conclusions of the two applications
of direct proof are combined in line 7.

EXAMPLE 2.4.20

Prove: p → q ⊢ p ∧ q ↔ p

         1. p → q           Given
(→)      2. p ∧ q           Assumption
         3. p               2 Simp

(←)      4. p               Assumption
         5. q               1, 4 MP
         6. p ∧ q           4, 5 Conj

         7. p ∧ q ↔ p       2–6 BP




Sometimes the steps for one part are simply the steps for the other in reverse. When 
this happens, our work is cut in half, and we can use the short rule of biconditional 
proof. These proofs are simply a sequence of replacements with or without the reasons. 
This is a good method when only rules of replacement are used as in the next example. 
(Notice that there are no hypotheses to assume.) 


EXAMPLE 2.4.21

Prove: ⊢ p → q ↔ p ∧ ¬q → ¬p

p → q ⇔ ¬p ∨ q              Impl
      ⇔ ¬p ∨ ¬p ∨ q         Idem
      ⇔ ¬p ∨ (¬p ∨ q)       Assoc
      ⇔ ¬p ∨ q ∨ ¬p         Com
      ⇔ ¬p ∨ ¬¬q ∨ ¬p       DN
      ⇔ ¬(p ∧ ¬q) ∨ ¬p      DM
      ⇔ p ∧ ¬q → ¬p         Impl
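
Because only two propositional variables are involved, the equivalence can also be confirmed semantically with a truth table. The Python sketch below is a check alongside the syntactic derivation, not a substitute for it; the helper names implies and equivalent are introduced here for illustration.

```python
from itertools import product


def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q


def equivalent(f, g, nvars):
    """True if f and g agree on every assignment of truth values."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=nvars))


def lhs(p, q):
    return implies(p, q)                     # p -> q


def rhs(p, q):
    return implies(p and not q, not p)       # p and not q -> not p


print(equivalent(lhs, rhs, 2))
```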


EXAMPLE 2.4.22

Let us use biconditional proof to show that

for all integers n, n is even if and only if n³ is even.

Since this is a biconditional, we must show both implications:

if [n is even], then [n³ is even],

and

if [n³ is even], then [n is even].

To prove the second conditional, we prove its contrapositive. Therefore, using
the pattern of the previous example, the structure is

(→) Assume n is even.
    Then, n = 2k for some integer k.
        ⋮
    n³ = 2l, with l being an integer.
    n³ is even.
(←) Assume n is odd.
    Then, n = 2k + 1 for some integer k.
        ⋮
    n³ = 2l + 1, with l being an integer.
    n³ is odd.




PROOF

Let n be an integer.

• Assume n is even. Then, n = 2k for some integer k. We show that
n³ is even. To do this, we calculate:

n³ = (2k)³ = 2(4k³),

which means that n³ is even.

• Now suppose that n is odd. This means that n = 2k + 1 for some
integer k. To show that n³ is odd, we again calculate:

n³ = (2k + 1)³ = 8k³ + 12k² + 6k + 1 = 2(4k³ + 6k² + 3k) + 1.

Hence, n³ is odd. ∎

Notice that the words were chosen carefully to make the proof more readable.
Furthermore, the example could have been written with the words necessary and sufficient
introducing the two subproofs. The (→) step could have been introduced with a phrase
like

to show sufficiency,

and the (←) could have opened with

as for necessity.


There will be times when we need to prove a sequence of biconditionals. The
propositional forms p₀, p₁, ..., pₙ₋₁ are pairwise equivalent if for all i, j,

pᵢ ⇔ pⱼ.

In other words,

p₀ ⇔ p₁, p₀ ⇔ p₂, ..., p₁ ⇔ p₀, p₁ ⇔ p₂, ..., pₙ₋₂ ⇔ pₙ₋₁.

To prove all of these, we make use of the Hypothetical Syllogism. For example, if we
know that p₀ → p₁, p₁ → p₂, and p₂ → p₀, then

p₀ → p₂ (because p₀ → p₁ ∧ p₁ → p₂),
p₁ → p₀ (because p₁ → p₂ ∧ p₂ → p₀),
p₂ → p₁ (because p₂ → p₀ ∧ p₀ → p₁).

The result is the equivalence rule.




INFERENCE RULE 2.4.23 [Equivalence Rule]

To prove that the propositional forms p₀, p₁, ..., pₙ₋₁ are pairwise equivalent,
prove:

p₀ → p₁, p₁ → p₂, ..., pₙ₋₂ → pₙ₋₁, pₙ₋₁ → p₀.

In practice, the equivalence rule will typically be used to prove propositions that
include the phrase

the following are equivalent.


EXAMPLE 2.4.24

Let

f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ⋯ + a₁x + a₀

be a polynomial with real coefficients. That is, each aᵢ is a real number and n is
a nonnegative integer. An integer r is a zero of f(x) if

aₙrⁿ + aₙ₋₁rⁿ⁻¹ + ⋯ + a₁r + a₀ = 0,

written as f(r) = 0, and a polynomial g(x) is a factor of f(x) if there is a
polynomial h(x) such that f(x) = g(x)h(x). Whether g(x) is a factor of f(x) or not,
there exist unique polynomials q(x) and r(x) such that

f(x) = g(x)q(x) + r(x)    (2.18)

and

the degree of r(x) is less than the degree of g(x).    (2.19)

This result is called the polynomial division algorithm. The polynomial q(x) is
the quotient and r(x) is the remainder. We prove that the following are
equivalent:

• r is a zero of f(x).
• r is a solution to f(x) = 0.
• x − r is a factor of f(x).

To do this, we prove three conditionals:

if [r is a zero of f(x)], then [r is a solution to f(x) = 0],

if [r is a solution to f(x) = 0], then [x − r is a factor of f(x)],

and

if [x − r is a factor of f(x)], then [r is a zero of f(x)].




We use direct proof on each. 


PROOF

Let aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ⋯ + a₁x + a₀ be a polynomial and assume that the
coefficients are real numbers. Denote the polynomial by f(x).

• Let r be a zero of f(x). By definition, this means f(r) = 0, so r is a
solution to f(x) = 0.

• Suppose r is a solution to f(x) = 0. The polynomial division
algorithm (2.18) gives polynomials q(x) and r(x) such that

f(x) = q(x)(x − r) + r(x)

and the degree of r(x) is less than 1 (2.19). Hence, r(x) is a constant
that we simply write as c. Now,

0 = f(r) = q(r)(r − r) + c = 0 + c = c.

Therefore, f(x) = q(x)(x − r), so x − r is a factor of f(x).

• Lastly, assume x − r is a factor of f(x). This means that there exists
a polynomial q(x) so that

f(x) = (x − r)q(x).

Thus,

f(r) = (r − r)q(r) = 0,

which means r is a zero of f(x). ∎
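
The middle step of the proof, in which division by x − r leaves a constant remainder equal to f(r), can be seen in a small computation. The Python sketch below uses synthetic division, a technique chosen here for illustration and not taken from the text; the function name divide_by_linear and the sample polynomial are likewise assumptions.

```python
def divide_by_linear(coeffs, r):
    """Divide a polynomial by x - r using synthetic division.

    coeffs lists the coefficients from the leading one down to the constant
    term and is assumed to be nonempty. Returns (quotient coefficients,
    remainder), where the remainder is a constant equal to f(r).
    """
    acc = []
    carry = 0
    for a in coeffs:
        carry = carry * r + a
        acc.append(carry)
    return acc[:-1], acc[-1]


f = [1, -6, 11, -6]             # x**3 - 6x**2 + 11x - 6
print(divide_by_linear(f, 1))   # remainder 0, so x - 1 is a factor
print(divide_by_linear(f, 4))   # nonzero remainder, so x - 4 is not a factor
```

A zero remainder corresponds to the case c = 0 in the proof, where f(x) = q(x)(x − r).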


Proof of Disjunctions

The second type of proof that relies on direct proof is the proof of a disjunction. To
prove p ∨ q, it is standard to assume ¬p and show q. This means that we would be using
direct proof to show ¬p → q. This is what we want because

¬p → q ⇔ ¬¬p ∨ q ⇔ p ∨ q.

The intuition behind the strategy goes like this. If we need to prove p ∨ q from some
hypotheses, it is not reasonable to believe that we can simply prove p and then use
Addition to conclude p ∨ q. Indeed, if we could simply prove p, we would expect the
conclusion to be stated as p and not p ∨ q. Hence, we need to incorporate both disjuncts
into the proof. We do this by assuming the negation of one of the disjuncts. If we can
prove the other, the disjunction must be true.




EXAMPLE 2.4.25

To prove

for all integers a and b, if ab = 0, then a = 0 or b = 0,

we assume ab = 0 and show that a ≠ 0 implies b = 0.

PROOF

Let a and b be integers. Let ab = 0 and suppose a ≠ 0. Then, a⁻¹ exists.
Multiplying both sides of the equation by a⁻¹ gives

a⁻¹ab = a⁻¹ · 0,

so b = 0. ∎


Proof by Cases 


The last type of proof that relies on direct proof is proof by cases. Suppose that we
want to prove p → q and this is difficult for some reason. We notice, however, that p
can be broken into cases. Namely, there exist p₀, p₁, ..., pₙ₋₁ such that

p ⇔ p₀ ∨ p₁ ∨ ⋯ ∨ pₙ₋₁.

If we can prove pᵢ → q for each i, we have proved p → q. If n = 2, then p ⇔ p₀ ∨ p₁,
and the justification of this is as follows:

(p₀ → q) ∧ (p₁ → q) ⇔ (¬p₀ ∨ q) ∧ (¬p₁ ∨ q)
                    ⇔ (q ∨ ¬p₀) ∧ (q ∨ ¬p₁)
                    ⇔ q ∨ ¬p₀ ∧ ¬p₁
                    ⇔ ¬p₀ ∧ ¬p₁ ∨ q
                    ⇔ ¬(p₀ ∨ p₁) ∨ q
                    ⇔ p₀ ∨ p₁ → q

This generalizes to the next rule.

INFERENCE RULE 2.4.26 [Proof by Cases (CP)]

For every positive integer n, if p ⇔ p₀ ∨ p₁ ∨ ⋯ ∨ pₙ₋₁, then

p₀ → q, p₁ → q, ..., pₙ₋₁ → q ⇒ p → q.


For example, since a is a real number if and only if a > 0, a = 0, or a < 0, if we
needed to prove a proposition about an arbitrary real number, it would suffice to prove
the result individually for a > 0, a = 0, and a < 0.
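
The propositional fact behind proof by cases, that proving each case separately amounts to proving the conclusion from the disjunction of the cases, can be checked by truth table for small n. The Python sketch below is illustrative: the function name cases_rule_holds is introduced here, and checking n from 1 to 5 is evidence for the general rule, not a proof of it.

```python
from itertools import product


def cases_rule_holds(n):
    """Check by truth table that (p_0 -> q) and ... and (p_{n-1} -> q)
    has the same truth value as (p_0 or ... or p_{n-1}) -> q."""
    for *ps, q in product([False, True], repeat=n + 1):
        each_case = all((not p) or q for p in ps)
        from_disjunction = (not any(ps)) or q
        if each_case != from_disjunction:
            return False
    return True


print([cases_rule_holds(n) for n in range(1, 6)])
```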




EXAMPLE 2.4.27

Our example of a proof by cases is a well-known one:

for all integers a and b, if a = ±b, then a divides b and b divides a.

The antecedent means a = b or a = −b, which are the two cases, so we have to
show both

if a = b, then a divides b and b divides a

and

if a = −b, then a divides b and b divides a.

This leads to the following structure:

→ Suppose a = ±b.
  Thus, a = b or a = −b.

  (Case 1) → Assume a = b.
             a divides b and b divides a.

  (Case 2) → Assume a = −b.
             a divides b and b divides a.

  ∴ a divides b and b divides a.

The final proof looks something like this.

PROOF

Let a and b be integers and suppose a = ±b. To show a divides b and b
divides a, we have two cases to prove.

• Assume a = b. Then, a = b · 1 and b = a · 1.

• Next assume a = −b. This means that a = b · (−1) and b = a · (−1).

In both cases, we have proved that a divides b and b divides a. ∎


Exercises 


1. Let n be an integer. Demonstrate that each of the following is divisible by 6.

(a) 18
(b) −24
(c) 0
(d) 6n + 12
(e) 2³ · 3⁴ · 7²
(f) (2n + 2)(3n + 6)


2. Let a be a nonzero integer. Write paragraph proofs for each proposition.

(a) 1 divides a.
(b) a divides 0.
(c) a divides a.


3. Write a universal proof for each proposition.

(a) For all real numbers x, (x + 2)² = x² + 4x + 4.
(b) For all integers x, x − 1 divides x² − 1.
(c) The square of every even integer is even.


4. Show why a proof of (∀x)p(x) is an application of direct proof.


5. Write existential proofs for each proposition.

(a) There exists a real number x such that x − 7 = 9.
(b) There exists an integer x such that x² + 2x − 3 = 0.
(c) The square of some integer is odd.


6. Prove each of the following by writing a paragraph proof.

(a) For all real numbers x, y, and z, x²z + 2xyz + y²z = z(x + y)².
(b) There exist real numbers u and v such that 2u + 5v = −29.
(c) For all real numbers x, there exists a real number y so that x − y = 10.
(d) There exists an integer x such that for all integers y, yx = x.
(e) For all real numbers a, b, and c, there exists a complex number x such that
    ax² + bx + c = 0.
(f) There exists an integer that divides every integer.


7. Provide counterexamples for each of the following false propositions.

(a) Every integer is a solution to x + 1 = 0.
(b) For every integer x, there exists an integer y such that xy = 1.
(c) The product of any two integers is even.
(d) For every integer n, if n is even, then n² is a multiple of eight.


8. Assuming that a, b, c, and d are integers with a ≠ 0 and c ≠ 0, give paragraph
proofs using direct proof for the following divisibility results.

(a) If a divides b, then a divides bd.
(b) If a divides b and a divides d, then a² divides bd.
(c) If a divides b and c divides d, then ac divides bd.
(d) If a divides b and b divides c, then a divides c.


9. Write paragraph proofs using direct proof.

(a) The sum of two even integers is even.
(b) The sum of two odd integers is even.
(c) The sum of an even and an odd is odd.
(d) The product of two even integers is even.
(e) The product of two odd integers is odd.
(f) The product of an even and an odd is even.


10. Let a and b be integers. Write paragraph proofs.

(a) If a and b are even, then a⁴ + b⁴ + 32 is divisible by 8.
(b) If a and b are odd, then 4 divides a⁴ + b⁴ + 6.

11. Prove the following by using direct proof to prove the contrapositive.

(a) For every integer n, if n² is even, then n is even.
(b) For all integers n, if n³ + n² is odd, then n is odd.
(c) For all integers a and b, if ab is even, then a is even or b is even.

12. Write paragraph proofs.

(a) The equation x − 10 = 23 has a unique solution.
(b) The equation 72x − 5 = 2 has a unique solution.
(c) For every real number y, the equation 2x + 5y = 10 has a unique solution.
(d) The equation x² + 5x + 6 = 0 has at most two integer solutions.

13. From the proof of Example 2.4.14, prove that q > q′ and u < m.

14. Prove the results of Exercise 9 indirectly.

15. Prove using the method of biconditional proof.

(a) pVqroaw,sor,apVqaoskpenrs
(b) p ∨ (¬q ∨ p), q ∨ (¬p ∨ q) ⊢ p ↔ q
(c) (p → q) ∧ (r → s), (p → ¬s) ∧ (r → ¬q), p ∨ r ⊢ q ↔ ¬s
(d) pAr>-7A(sVt)3asVat>pArFs<ot

16. Prove using the short rule of biconditional proof.

(a) p → q ∧ r ↔ (p → q) ∧ (p → r)
(b) p → q ∨ r ↔ p ∧ ¬q → r
(c) p ∨ q → r ↔ (p → r) ∧ (q → r)
(d) p ∧ q → r ↔ (p → r) ∨ (q → r)

17. Let a, b, c, and d be integers. Write paragraph proofs using biconditional proof.

(a) a is even if and only if a² is even.
(b) a is odd if and only if a + 1 is even.
(c) a is even if and only if a + 2 is even.
(d) a³ + a² + a is even if and only if a is even.
(e) If c ≠ 0, then a divides b if and only if ac divides bc.

18. Suppose that a and b are integers. Prove that the following are equivalent.

• a divides b.
• a divides −b.
• −a divides b.
• −a divides −b.

19. Let a be an integer. Prove that the following are equivalent.

• a is divisible by 3.
• 3a is divisible by 9.
• a + 3 is divisible by 3.



20. Prove by using direct proof but do not use the contrapositive: for all integers a and 
b, if ab is even, then a is even or b is even. 


21. Prove using proof by cases (Inference Rule 2.4.26).

(a) For all integers a and b, if a = 0 or b = 0, then ab = 0.
(b) The square of every odd integer is of the form 8k + 1 for some integer k.
    (Hint: Square 2l + 1. Then consider two cases: l is even and l is odd.)
(c) a² + a + 1 is odd for every integer a.
(d) The fourth power of every odd integer is of the form 16k + 1 for some integer k.
(e) Every nonhorizontal line intersects the x-axis.


22. Let a be an integer. Prove by cases. 
(a) 2 divides a(a + 1) 
(b) 3 divides a(a+ 1)(a + 2) 


23. For any real number c, the absolute value of c is defined as

|c| = c if c ≥ 0,
|c| = −c if c < 0.

Let a be a positive real number. Prove the following about absolute value for every real
number x.

(a) |−x| = |x|
(b) |x²| = |x|²
(c) x ≤ |x|
(d) |xy| = |x| |y|
(e) |x| < a if and only if −a < x < a.
(f) |x| > a if and only if x > a or x < −a.


24. Take a and b to be nonzero integers and prove the given propositions.

(a) a divides 1 if and only if a = ±1.
(b) If a = ±b, then |a| = |b|.


CHAPTER 3 


SET THEORY 


3.1. SETS AND ELEMENTS 


The development of logic that resulted in the work of Chapters 1 and 2 went through
many stages and benefited from the work of various mathematicians and logicians
through the centuries. Although modern logic can trace its roots to Descartes with
his mathesis universalis and Gottfried Leibniz’s De Arte Combinatoria (1666), the
beginnings of modern symbolic logic are generally attributed to Augustus De Morgan
[Formal Logic (1847)], George Boole [Mathematical Analysis of Logic (1847) and An
Investigation of the Laws of Thought (1854)], and Frege [Begriffsschrift (1879), Die
Grundlagen der Arithmetik (1884), and Grundgesetze der Arithmetik (1893)]. However,
when it comes to set theory, it was Georg Cantor who, with his first paper, “Ueber
eine Eigenschaft des Inbegriffs aller reellen algebraischen Zahlen” (1874), and over a
decade of research, founded the subject. For the next four chapters, Cantor’s
set theory will be our focus.

A set is a collection of objects known as elements. An element can be almost anything,
such as numbers, functions, or lines. A set is a single object that can contain
many elements. Think of it as a box with things inside. The box is the set, and the
things are the elements. We use uppercase letters to label sets, and elements will usually
be represented by lowercase letters. The symbol ∈ (fashioned after the Greek letter
epsilon) is used to mean “element of,” so if A is a set and a is an element of A, write
∈aA or, the more standard, a ∈ A. The notation a, b ∈ A means a ∈ A and b ∈ A. If
c is not an element of A, write c ∉ A. If A contains no elements, it is the empty set. It
is represented by the symbol ∅. Think of the empty set as a box with no things inside.

A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’Leary.
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc.


Rosters 


Since the elements are those that distinguish one set from another, one method that
is used to write a set is to list its elements and surround them with braces. This is
called the roster method of writing a set, and the list is known as a roster. The braces
signify that a set has been defined. For example, the set of all integers between 1 and
10 inclusive is

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.

Read this as “the set containing 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10.” The set of all integers
between 1 and 10 exclusive is

{2, 3, 4, 5, 6, 7, 8, 9}.

If the roster is too long, use ellipses (...). When there is a pattern to the elements of
the set, write down enough members so that the pattern is clear. Then use the ellipses
to represent the continuing pattern. For example, the set of all integers inclusively
between 1 and 1,000,000 can be written as

{1, 2, 3, ..., 999,999, 1,000,000}.

Follow this strategy to write infinite sets as rosters. For instance, the set of even integers
can be written as

{..., −4, −2, 0, 2, 4, ...}.


EXAMPLE 3.1.1

• As a roster, { } denotes the empty set. Warning: Never write {∅} for the empty
set. This set has one element in it.

• A set that contains exactly one element is called a singleton. Hence, the sets {1},
{f}, and {∅} are singletons written in roster form. Also, 1 ∈ {1}, f ∈ {f},
and ∅ ∈ {∅}.

• The set of linear functions that intersect the origin with an integer slope can be
written as:

{..., −2x, −x, 0, x, 2x, ...}.

(Note: Here 0 represents the function f(x) = 0.)


Let A and B be sets. These are equal if they contain exactly the same elements. The
notation for this is A = B. What this means is if any element is in A, it is also in B, and
conversely, if an element is in B, it is in A. To fully understand set equality, consider
again the analogy between sets and boxes. Suppose that we have a box containing a
carrot and a rabbit. We could describe it with the phrase the box that contains the
carrot and the rabbit. Alternately, it could be referred to as the box that contains the
orange vegetable and the furry, cotton-tailed animal with long ears. Although these
are different descriptions, they do not refer to different boxes. Similarly, the set {1, 3}
and the set containing the solutions of (x − 1)(x − 3) = 0 are equal because they contain
the same elements. Furthermore, the order in which the elements are listed does not
matter. The box can just as easily be described as the box with the rabbit and the carrot.
Likewise, {1, 3} = {3, 1}. Lastly, suppose that the box is described as containing the
carrot, the rabbit, and the carrot, forgetting that the carrot had already been mentioned.
This should not be confusing, for one understands that such mistakes are possible. It is
similar with sets. A repeated element does not add to the set. Hence, {1, 3} = {1, 3, 1}.
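
Python's built-in set type happens to follow the same convention, which makes it convenient for experimenting with these three observations. The sketch below is only an illustration: the variable names are chosen here, and the solution set is found by searching the integers from −10 to 10, an assumption that suffices for this particular equation.

```python
# The roster {1, 3} and the solution set of (x - 1)(x - 3) = 0.
roster = {1, 3}
solutions = {x for x in range(-10, 11) if (x - 1) * (x - 3) == 0}

same_elements = roster == solutions        # different descriptions, same set
order_ignored = {1, 3} == {3, 1}           # listing order does not matter
repeats_ignored = {1, 3} == {1, 3, 1}      # a repeated element adds nothing

print(same_elements, order_ignored, repeats_ignored)
```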


Famous Sets 


Although sets can contain many different types of elements, numbers are probably the
most common for mathematics. For this reason, particularly important sets of numbers
have been given their own symbols.

Symbol    Name
N         The set of natural numbers
Z         The set of integers
Q         The set of rational numbers
R         The set of real numbers
C         The set of complex numbers

As rosters,

N = {0, 1, 2, ...}

and

Z = {..., −2, −1, 0, 1, 2, ...}.

Notice that we define the set of natural numbers to include zero and do not make a
distinction between counting numbers and whole numbers. Instead, write

Z⁺ = {1, 2, 3, ...}

and

Z⁻ = {..., −3, −2, −1}.
EXAMPLE 3.1.2

• 10 ∈ Z⁺, but 0 ∉ Z⁺.
• 4 ∈ N, but −5 ∉ N.
• −5 ∈ Z, but .65 ∉ Z.
• .65 ∈ Q and 1/2 ∈ Q, but π ∉ Q.
• π ∈ R, but 3 − 2i ∉ R.
• 3 − 2i ∈ C.


Of the sets mentioned above, the real numbers are probably the most familiar. It 
is the set of numbers most frequently used in calculus and is often represented by a 
number line. The line can be subdivided into intervals. Given two endpoints, an 
interval includes all real numbers between the endpoints and possibly the endpoints 
themselves. Interval notation is used to name these sets. A parenthesis next to an 
endpoint means that the endpoint is not included in the set, while a bracket means that 
the endpoint is included. If the endpoints are included, the interval is closed. If they 
are excluded, the interval is open. If one endpoint is included and the other is not, the 
interval is half-open. If the interval has only one endpoint, then the set is called a ray
and is defined using the infinity symbol (∞), with or without the negative sign.


DEFINITION 3.1.3

Let a, b ∈ R such that a < b.

closed interval       [a, b]      closed ray    [a, ∞)
open interval         (a, b)      closed ray    (−∞, a]
half-open interval    [a, b)      open ray      (a, ∞)
half-open interval    (a, b]      open ray      (−∞, a)


EXAMPLE 3.1.4

We can describe (4, 7) as

the interval of real numbers between 4 and 7 exclusive

and [4, 7] as

all real numbers between 4 and 7 inclusive.

There is not a straightforward way to name the half-open intervals. For (4, 7], we
can try

the set of all real numbers x such that 4 < x ≤ 7

or

the set of all real numbers greater than 4 and less than or equal to 7.

The infinity symbol does not represent a real number, so a parenthesis must be used
with it. Furthermore, the interval (−∞, ∞) can be used to denote R.


[Figure: two number lines marked from −4 to 4, showing (a) (−1, 3] and (b) (−∞, 2]]

Figure 3.1 A half-open interval and a closed ray.


EXAMPLE 3.1.5

The interval (−1, 3] contains all real numbers that are greater than −1 but less
than or equal to 3 [Figure 3.1(a)]. A common mistake is to equate (−1, 3] with
{0, 1, 2, 3}. It is important to remember that (−1, 3] includes all real numbers
between −1 and 3. Hence, this set is infinite, as is (−∞, 2). It contains all real
numbers less than 2 [Figure 3.1(b)].


EXAMPLE 3.1.6

Let p(x) := x + 2 = 7. Since p(5) and there is no other real number a such that
p(a), there exists a unique x ∈ R such that p(x). However, if a is an element of
Z⁻ or (−∞, 5), then ¬p(a).

EXAMPLE 3.1.7

If q(x) := x > 10, then q(x) for all x ∈ [20, 100], there exists x ∈ Q such that
¬q(x), and there is no element a of {1, 2, 3} such that q(a).


Abstraction 


When trying to write sets as rosters, we quickly discover issues with the technique. 
Since the rational numbers are defined using integers, we suspect that Q can be written 
as a roster, but when we try to begin a list, such as 


ee 
2 3 3 
we realize that there are complications with the pattern and are not quite sure that
we will exhaust them all. When considering R, we know immediately that a roster is
out of the question. We conclude that we need another method. 
Fix a first-order alphabet with theory symbols S. Let $A$ be a set and let the S-formula $p(x)$
have the property that for every $a$,

$$a \in A \Leftrightarrow p(a).$$


Notice that p(x) completely describes the members of A. Namely, whenever we write 
a € A, we can also write p(a), and, conversely, whenever we write p(a), we can also 
write $a \in A$. For example, let $E$ be the set of even integers. As a roster,

$$E = \{\ldots, -4, -2, 0, 2, 4, \ldots\}.$$




Let $p(x)$ denote the formula
$$\exists n(n \in \mathbb{Z} \wedge x = 2n). \tag{3.1}$$

The even integers are exactly those numbers $x$ such that $p(x)$. In particular, we have
$p(2)$ and $p(-4)$ but not $p(5)$. Therefore, 2 and $-4$ are elements of $E$, but 5 is not.


DEFINITION 3.1.8

Let $\mathsf{A}$ be a first-order alphabet with theory symbols S. Let $p(x)$ be an S-formula
and $A$ be a set. Write $A = \{x : p(x)\}$ to mean
$$a \in A \Leftrightarrow p(a).$$
Using $\{x : p(x)\}$ to identify $A$ is called the method of abstraction.

Using Definition 3.1.8 and (3.1), write $E$ using abstraction as
$$E = \{x : \exists n(n \in \mathbb{Z} \wedge x = 2n)\}$$


or 
E = {x : p(x)}. 

Read this as “the set of x such that p(x).” Because x = 2n, it is customary to remove x 
from the definition of sets like E and write 

E={2n: €nZ}, 
or 

$$E = \{2n : n \in \mathbb{Z}\}.$$
Read this as “the set of all $2n$ such that $n$ is an integer.” This simplified notation is still
considered abstraction. Its form can be summarized as 


{elements : condition}. 


That is, what the elements look like comes before the colon, and the condition that must
be satisfied to be an element of the set comes after the colon.
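Set comprehensions in a programming language follow the same {elements : condition} pattern. The sketch below is an illustration only: it assumes a finite window of integers (named `window` here) as a stand-in for Z, since the full set of even integers cannot be enumerated, and the names `p`, `E_by_formula`, and `E_by_elements` are hypothetical.

```python
# Finite window standing in for Z (an assumption; Z itself is infinite).
window = range(-5, 6)

def p(x):
    """The formula (3.1): there exists n in the window with x = 2n."""
    return any(x == 2 * n for n in window)

# E = {x : p(x)}, restricted to the window.
E_by_formula = {x for x in window if p(x)}

# Simplified form {2n : n in Z}: what the elements look like comes first.
E_by_elements = {2 * n for n in window if 2 * n in window}
```

Both comprehensions describe the same finite piece of E, mirroring the two ways of writing the set by abstraction.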


EXAMPLE 3.1.9

Given the quadratic equation $x^2 - x - 2 = 0$, we know that its solutions are $-1$
and 2. Thus, its solution set is $A = \{-1, 2\}$. Using the method of abstraction,
this can be written as


$$A = \{x : x \cdot x - x - 2 = 0 \wedge x \in \mathbb{R}\}.$$


However, as we have seen, it is customary to write the formula so that it is easier
to read, so
$$A = \{x : x^2 - x - 2 = 0 \wedge x \in \mathbb{R}\},$$
or, using a common notation,
$$A = \{x \in \mathbb{R} : x^2 - x - 2 = 0\}.$$

Therefore, given an arbitrary polynomial $f(x)$, its solution set over the real numbers is

$$\{x \in \mathbb{R} : f(x) = 0\}.$$


EXAMPLE 3.1.10 


Since $x \notin \varnothing$ is always true, to write the empty set using abstraction, we use a
formula like $x \neq x$ or a contradiction like $P \wedge \neg P$, where $P$ is a propositional
form. Then,

$$\varnothing = \{x \in \mathbb{R} : x \neq x\} = \{x : P \wedge \neg P\}.$$


EXAMPLE 3.1.11 


Using the natural numbers as the starting point, Z and Q can be defined using 
the abstraction method by writing 


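$$\mathbb{Z} = \{n : n \in \mathbb{N} \vee -n \in \mathbb{N}\}$$
and
$$\mathbb{Q} = \left\{\frac{a}{b} : a, b \in \mathbb{Z} \wedge b \neq 0\right\}.$$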


Notice the redundancy in the definition of Q. The fraction 1/2 is named multiple 
times like 2/4 or 9/18, but remember that this does not mean that the numbers 
appear infinitely many times in the set. They appear only once. 


EXAMPLE 3.1.12 
The open intervals can be defined using abstraction.
$$(a, b) = \{x \in \mathbb{R} : a < x < b\},$$
$$(a, \infty) = \{x \in \mathbb{R} : a < x\},$$
$$(-\infty, b) = \{x \in \mathbb{R} : x < b\}.$$
See Exercise 9 for the closed and half-open intervals. Also, as with $\mathbb{Z}^+$ and $\mathbb{Z}^-$,
the superscript $+$ or $-$ is always used to denote the positive or negative numbers,
respectively, of a set. For example,

$$\mathbb{R}^+ = \{x \in \mathbb{R} : x > 0\}$$

and
$$\mathbb{R}^- = \{x \in \mathbb{R} : x < 0\}.$$




EXAMPLE 3.1.13

Each of the following is written using the abstraction method and, where appropriate,
as a roster.

• The set of all rational numbers with denominator equal to 3 in roster form is
$$\left\{\ldots, -\frac{3}{3}, -\frac{2}{3}, -\frac{1}{3}, \frac{0}{3}, \frac{1}{3}, \frac{2}{3}, \frac{3}{3}, \ldots\right\}.$$
Using abstraction, it is
$$\left\{\frac{n}{3} : n \in \mathbb{Z}\right\}.$$

• The set of all linear polynomials with integer coefficients and leading coefficient
equal to 1 is
$$\{\ldots, x - 2, x - 1, x, x + 1, x + 2, \ldots\}.$$
Using abstraction, it is
$$\{x + n : n \in \mathbb{Z}\}.$$

• The set of all polynomials of degree at most 5 can be written as
$$\{a_5 x^5 + \cdots + a_1 x + a_0 : a_i \in \mathbb{R} \wedge i = 0, 1, \ldots, 5\}.$$


Exercises 


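1. Determine whether the given propositions are true or false.
(a) $0 \in \mathbb{N}$
(b) $1/2 \in \mathbb{Z}$
(c) $-4 \in \mathbb{Q}$
(d) $4 + \pi \in \mathbb{R}$
(e) $4.34534 \in \mathbb{C}$
(f) $\{1, 2\} = \{2, 1\}$
(g) $\{1, 2\} = \{1, 2, 1\}$
(h) $[1, 2] = \{1, 2\}$
(i) $(1, 3) = \{2\}$
(j) $-1 \in (-\infty, -1)$
(k) $-1 \in [-1, 0)$
(l) $\varnothing \in (-2, 2)$
(m) $\varnothing \in \varnothing$
(n) $0 \in \varnothing$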


2. Write the given sets as rosters. 
(a) The set of all integers between 1 and 5 inclusive
(b) The set of all odd integers 
(c) The set of all nonnegative integers 




(d) The set of integers in the interval $(-3, 7]$
(e) The set of rational numbers in the interval (0, 1) that can be represented with 
exactly two decimal places 


3. If possible, find an element a in the given set such that a+ 3.14 = 0. 
(a) Z 
(b) R 
(c) R* 


(f) C 
(g) (0,6) 
(h) $(-\infty, -1)$


4. Determine whether the following are true or false. 
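(a) $1 \in \{x : p(x)\}$ when $\neg p(1)$
(b) $7 \in \{x \in \mathbb{R} : x^2 - 5x - 14 = 0\}$
(c) $7x^2 - 0.5x \in \{a_2 x^2 + a_1 x + a_0 : a_i \in \mathbb{Q}\}$
(d) $xy \in \{2k : k \in \mathbb{Z}\}$ if $x$ is even and $y$ is odd.
(e) $\cos\theta \in \{a\cos\theta + b\sin\theta : a, b \in \mathbb{R}\}$
(f) $\{1, 3\} = \{x : (x - 1)(x - 3) = 0\}$
(g) $\{1, 3\} = \{x : (x - 1)(x - 3)^2 = 0\}$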
(h) {[43] :aeR}={[59] :xe@Randy=0} 


5. Write the following given sets in roster form. 
(a) $\{-3n : n \in \mathbb{Z}\}$
(b) $\{0 \cdot n : n \in \mathbb{R}\}$
(c) $\{n\cos x : n \in \mathbb{Z}\}$
(d) $\{ax^2 + ax + a : a \in \mathbb{N}\}$
(ce) {[7| :nez} 
6. Use a formula to uniquely describe the elements in the following sets. For example,
$x \in \mathbb{N}$ if and only if $x \in \mathbb{Z} \wedge x \ge 0$.
(a) $(0, 1)$
(b) $(-3, 3]$
(c) $[0, \infty)$
(d) $\mathbb{Z}^+$
(e) {...,-2,-1,0,1,2,...} 
(f) {2a,4a,6a,...} 


7. Write each set using the method of abstraction. 
(a) All odd integers 
(b) All positive rational numbers 
(c) All integer multiples of 7 
(d) All integers that have a remainder of 1 when divided by 3




(e) All ordered pairs of real numbers in which the x-coordinate is positive and 
the y-coordinate is negative 

(f) All complex numbers whose real part is 0 

(g) All closed intervals that contain $\pi$

(h) All open intervals that do not contain a rational number

(i) All closed rays that contain no numbers in $(-\infty, 3]$

(j) All $2 \times 2$ matrices with real entries that have a diagonal sum of 0

(k) All polynomials of degree at most 3 with real coefficients 


8. Write the given sets of real numbers using interval notation. 
(a) The set of real numbers greater than 4 
(b) The set of real numbers between —6 and —5 inclusive 
(c) The set of real numbers x so that x <5 
(d) The set of real numbers x such that 10 < x < 14 


9. Let $a, b \in \mathbb{R}$ with $a < b$. Write the given intervals using the abstraction method.
(a) $[a, b]$
(b) $[a, \infty)$
(c) $(-\infty, b]$
(d) $(a, b]$


3.2 SET OPERATIONS 


We now use connectives to define the set operations. These allow us to build new sets 
from given ones. 


Union and Intersection

The first set operation is defined using the $\vee$ connective.

DEFINITION 3.2.1

The union of $A$ and $B$ is

$$A \cup B = \{x : x \in A \vee x \in B\}.$$

The union of sets can be viewed as the combination of all elements from the sets. On
the other hand, the next set operation is defined with $\wedge$ and can be considered as the
overlap between the given sets.

DEFINITION 3.2.2

The intersection of $A$ and $B$ is

$$A \cap B = \{x : x \in A \wedge x \in B\}.$$




Figure 3.2 Venn diagrams for union and intersection: (a) $A \cup B$, (b) $A \cap B$.


For example, if $A = \{1, 2, 3, 4\}$ and $B = \{3, 4, 5, 6\}$, then
$$A \cup B = \{1, 2, 3, 4, 5, 6\}$$
and
$$A \cap B = \{3, 4\}.$$
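The link between the connectives and the two operations can be seen by writing the definitions as set comprehensions. This is a small sketch under the assumption of a finite range of candidates (named `candidates` here, a hypothetical stand-in for the universe); Python's `or` and `and` play the roles of the disjunction and conjunction.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
candidates = range(0, 10)  # assumed finite pool of possible elements

# Union: x is in A OR x is in B.
union = {x for x in candidates if x in A or x in B}

# Intersection: x is in A AND x is in B.
intersection = {x for x in candidates if x in A and x in B}
```

The results agree with Python's built-in set operators `A | B` and `A & B`.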


The operations of union and intersection can be illustrated with pictures called Venn 
diagrams, named after the logician John Venn who used a variation of these drawings 
in his text on symbolic logic (Venn 1894). First, assume that all elements are members 
of a fixed universe U (page 97). In set theory, the universe is considered to be the 
set of all possible elements in a given situation. Use circles to represent sets and a 
rectangle to represent U. The space inside these shapes represents where elements might
exist. Shading is used to represent where elements might exist after applying some set 
operations. The Venn diagram for union is in Figure 3.2(a) and the one for intersection 
is in Figure 3.2(b). If sets have no elements in their intersection, we can use the next 
definition to name them. 


DEFINITION 3.2.3

The sets $A$ and $B$ are disjoint or mutually exclusive when $A \cap B = \varnothing$.

The sets $\{1, 2, 3\}$ and $\{6, 7\}$ are disjoint. A Venn diagram for two disjoint sets is given
in Figure 3.3.

Set Difference

The next two set operations take all of the elements of one set that are not in another.
They are defined using the not connective.

DEFINITION 3.2.4

The set difference of $B$ from $A$ is

$$A \setminus B = \{x : x \in A \wedge x \notin B\}.$$




Figure 3.3 A Venn diagram for disjoint sets.


The complement of $A$ is defined as

$$\overline{A} = U \setminus A = \{x : x \in U \wedge x \notin A\}.$$


Read A \ B as “A minus B” or “A without B.” See Figure 3.4(a) for the Venn diagram 
of the set difference of sets and Figure 3.4(b) for the complement of a set. 


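EXAMPLE 3.2.5

The following equalities use set difference.

• $\mathbb{N} = \mathbb{Z} \setminus \mathbb{Z}^-$

• The set of irrational numbers is $\mathbb{R} \setminus \mathbb{Q}$.

• $\mathbb{R} = \mathbb{C} \setminus \{a + bi : a, b \in \mathbb{R} \wedge b \neq 0\}$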


Figure 3.4 Venn diagrams for set difference and complement: (a) $A \setminus B$, (b) $\overline{A}$.


EXAMPLE 3.2.6

Let $U = \{1, 2, \ldots, 10\}$. Use a Venn diagram to find the results of the set operations
on $A = \{1, 2, 3, 4, 5\}$ and $B = \{3, 4, 5, 6, 7, 8\}$. Each element will be
represented as a point and labeled with a number.

• $A \cup B = \{1, 2, 3, 4, 5, 6, 7, 8\}$

• $A \cap B = \{3, 4, 5\}$

• $A \setminus B = \{1, 2\}$

• $\overline{A} = \{6, 7, 8, 9, 10\}$
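The four results of Example 3.2.6 can be reproduced with a few lines of code. The sketch below assumes Python's built-in sets; the function name `complement` is hypothetical, since Python has no complement operator and the universe must be supplied explicitly.

```python
U = set(range(1, 11))          # the universe {1, 2, ..., 10}
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7, 8}

def complement(X, universe=U):
    """Complement of X relative to the given universe: U \\ X."""
    return universe - X

union = A | B
intersection = A & B
difference = A - B
complement_of_A = complement(A)
```

Changing the universe changes the complement, which is why the definition of the complement always refers to U.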


EXAMPLE 3.2.7

Let $C = (-4, 2)$, $D = [-1, 3]$, and $U = \mathbb{R}$. Use the diagram to perform the set
operations.

[Diagram: number lines from $-4$ to 4 marking $C \cup D$, $C \cap D$, and $C \setminus D$.]

• $C \cup D = (-4, 3]$

• $C \cap D = [-1, 2)$

• $C \setminus D = (-4, -1)$

• $\overline{C} = (-\infty, -4] \cup [2, \infty)$




Cartesian Products 


The last set operation is not related to the logical connectives as the others are, but it is
nonetheless very important to mathematics. Let $A$ and $B$ be sets. Given elements
$a \in A$ and $b \in B$, we call $(a, b)$ an ordered pair. In this context, $a$ and $b$ are called
coordinates. It is similar to the set $\{a, b\}$ except that the order matters. The definition
is due to Kazimierz Kuratowski (1921).

DEFINITION 3.2.8

If $a \in A$ and $b \in B$,
$$(a, b) = \{\{a\}, \{a, b\}\}.$$

Notice that $(a, b) = (a', b')$ means that
$$\{\{a\}, \{a, b\}\} = \{\{a'\}, \{a', b'\}\},$$
which implies that $a = a'$ and $b = b'$. Therefore,
$$(a, b) = (a', b') \text{ if and only if } a = a' \text{ and } b = b'.$$
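The characteristic property of Kuratowski's ordered pair can be tested on a small domain. This is an illustrative sketch only: `kpair` is a hypothetical helper, and `frozenset` is used because Python sets can only contain hashable (immutable) elements. A finite check like this supports the claim but does not replace the proof.

```python
def kpair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}} built from frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def property_holds(domain):
    """Check (a, b) = (c, d) iff a = c and b = d for all values in domain."""
    return all(
        (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
        for a in domain for b in domain for c in domain for d in domain
    )
```

Note that `kpair(1, 1)` collapses to `{{1}}`, because `{1, 1}` and `{1}` are the same set, yet the pair still determines both coordinates.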


The set of all ordered pairs with the first coordinate from A and the second from B 
is named after René Descartes. 


DEFINITION 3.2.9

The Cartesian product of $A$ and $B$ is

$$A \times B = \{(a, b) : a \in A \wedge b \in B\}.$$

The product $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ is the set of ordered pairs of the Cartesian plane.

EXAMPLE 3.2.10

• Since $(1, 2) \neq (2, 1)$,
$$\{1, 2\} \times \{0, 1, 2\} = \{(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)\}.$$
Even though we have a set definition for an ordered pair, we can still visually
represent this set on a grid as in Figure 3.5(a).

• If $A = \{1, 2, 7\}$ and $B = \{\varnothing, \{1, 5\}\}$,
$$A \times B = \{(1, \varnothing), (1, \{1, 5\}), (2, \varnothing), (2, \{1, 5\}), (7, \varnothing), (7, \{1, 5\})\}.$$
See Figure 3.5(b).

• For any set $A$, $\varnothing \times A = A \times \varnothing = \varnothing$ because $\neg\exists x(x \in \varnothing \wedge x \in A)$.
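The products in Example 3.2.10 can be generated with the standard library. The sketch below assumes Python tuples as a stand-in for ordered pairs (rather than the Kuratowski sets of Definition 3.2.8), and `cartesian` is a hypothetical helper name.

```python
from itertools import product

def cartesian(A, B):
    """Cartesian product A x B as a set of Python tuples."""
    return set(product(A, B))

grid = cartesian({1, 2}, {0, 1, 2})

# B contains sets as elements, so they must be frozensets to be hashable.
B = {frozenset(), frozenset({1, 5})}
mixed = cartesian({1, 2, 7}, B)

empty_left = cartesian(set(), {1, 2})
empty_right = cartesian({1, 2}, set())
```

Both products with the empty set are empty, since there is no first (or second) coordinate to choose.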




Figure 3.5 Two Cartesian products: (a) $\{1, 2\} \times \{0, 1, 2\}$, (b) $\{1, 2, 7\} \times \{\varnothing, \{1, 5\}\}$.


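We generalize Definition 3.2.8 by defining
$$(a, b, c) = \{\{a\}, \{a, b\}, \{a, b, c\}\},$$
$$(a, b, c, d) = \{\{a\}, \{a, b\}, \{a, b, c\}, \{a, b, c, d\}\},$$
and for $n \in \mathbb{N}$,
$$(a_0, a_1, \ldots, a_{n-1}) = \{\{a_0\}, \{a_0, a_1\}, \ldots, \{a_0, a_1, \ldots, a_{n-1}\}\},$$
which is called an ordered n-tuple. Then,
$$A \times B \times C = \{(a, b, c) : a \in A \wedge b \in B \wedge c \in C\},$$
$$A \times B \times C \times D = \{(a, b, c, d) : a \in A \wedge b \in B \wedge c \in C \wedge d \in D\},$$
and
$$A^n = \{(a_0, a_1, \ldots, a_{n-1}) : a_i \in A \wedge i = 0, 1, \ldots, n - 1\}.$$

Specifically, $\mathbb{R}^3 = \mathbb{R} \times \mathbb{R} \times \mathbb{R}$, and
$$\mathbb{R}^n = \underbrace{\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}}_{n \text{ times}},$$
which is known as Cartesian n-space.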
EXAMPLE 3.2.11

Let $A = \{1\}$, $B = \{2\}$, $C = \{3\}$, and $D = \{4\}$. Then,
$$A \times B \times C \times D = \{(1, 2, 3, 4)\}$$
is a singleton containing an ordered 4-tuple. Also,
$$(A \times B) \cup (C \times D) = \{(1, 2), (3, 4)\},$$
but
$$A \times (B \cup C) \times D = \{(1, 2, 4), (1, 3, 4)\}.$$
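The three sets in Example 3.2.11 can be recomputed directly. This sketch again assumes tuples in place of set-theoretic n-tuples; the variable names `four_tuples`, `pairs`, and `triples` are hypothetical.

```python
from itertools import product

A, B, C, D = {1}, {2}, {3}, {4}

# A x B x C x D: a singleton containing one ordered 4-tuple.
four_tuples = set(product(A, B, C, D))

# (A x B) U (C x D): a union of two sets of ordered pairs.
pairs = set(product(A, B)) | set(product(C, D))

# A x (B U C) x D: ordered triples whose middle coordinate is 2 or 3.
triples = set(product(A, B | C, D))
```

The elements of the three results have different lengths (4, 2, and 3), which shows that the placement of parentheses and operations changes the kind of object produced.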




Order of Operations 


As with the logical connectives, we need an order to make sense of expressions that 
involve many operations. To do this, we note the association between the set operations 
and certain logical connectives. 


$\overline{A}$        $\neg$
$A \setminus B$   $\neg$
$A \cap B$        $\wedge$
$A \cup B$        $\vee$


From this we derive the order for the set operations. 


DEFINITION 3.2.12 [Order of Operations]

To find a set determined by set operations, read from left to right and use the
following precedence.

• sets within parentheses (innermost first),
• complements,
• set differences,
• intersections,
• unions.


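EXAMPLE 3.2.13

If the universe is taken to be $\{1, 2, 3, 4, 5\}$, then
$$\{5\} \cup \overline{\{1, 2\}} \cap \{2, 3\} = \{5\} \cup \{3, 4, 5\} \cap \{2, 3\} = \{5\} \cup \{3\} = \{3, 5\}.$$
This set can be written using parentheses as
$$\{5\} \cup (\overline{\{1, 2\}} \cap \{2, 3\}).$$
However,
$$(\{5\} \cup \overline{\{1, 2\}}) \cap \{2, 3\} = (\{5\} \cup \{3, 4, 5\}) \cap \{2, 3\} = \{3, 4, 5\} \cap \{2, 3\} = \{3\},$$
showing that

$$\{5\} \cup \overline{\{1, 2\}} \cap \{2, 3\} \neq (\{5\} \cup \overline{\{1, 2\}}) \cap \{2, 3\}.$$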
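Python's set operators happen to follow the same relative precedence as Definition 3.2.12: `-` (difference) binds tighter than `&` (intersection), which binds tighter than `|` (union). The sketch below recomputes Example 3.2.13 under that observation; `comp` is a hypothetical helper for the complement, which Python does not provide as an operator.

```python
U = {1, 2, 3, 4, 5}

def comp(X):
    """Complement of X relative to the universe U."""
    return U - X

# No parentheses: the intersection is evaluated before the union.
unparenthesized = {5} | comp({1, 2}) & {2, 3}

# Explicit parentheses around the intersection give the same set.
intersection_first = {5} | (comp({1, 2}) & {2, 3})

# Forcing the union first gives a different set.
union_first = ({5} | comp({1, 2})) & {2, 3}
```

When in doubt, adding parentheses costs nothing and removes any dependence on the precedence rules.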




EXAMPLE 3.2.14

Define $A = \{1\}$, $B = \{2\}$, $C = \{3\}$, and $D = \{4\}$. Then, we have

$$A \cup B \cup C \cup D = \{1, 2, 3, 4\}.$$
Written with parentheses and brackets,
$$A \cup B \cup C \cup D = ([(A \cup B) \cup C] \cup D).$$

With the given assignments, $A \cap B \cap C \cap D$ is empty.


Exercises 


1. Each of the given propositions is false. Replace the underlined word with another
word to make the proposition true. 

(a) Intersection is defined using a disjunction. 

(b) Set diagrams are used to illustrate set operations. 

(c) R\(R \ Q) is the set of irrational numbers. 

(d) Set difference has a higher order of precedence than complements. 

(ce) The complement of A is equal to R set minus A. 

(f) The intersection of two intervals is always an interval. 

(g) The union of two intervals is never an interval. 

(h) Ax Bis equal to @ if B does not contain ordered pairs. 


2. Let A = {0,2,4,6}, B = {3,4,5,6}, C = {0,1,2}, and U = {0,1,...,9, 10}. 
Write the given sets in roster notation. 

(a) AUB 

(b) ANB 

(c) A\ B 

(d) B\A 

(e) A 

(f) AxB 

(g) AUBNC 

(h) ANBUC 

(i) ANB 

(jj) AUANB 

(k) A\ B\C 

(I) AUB\ANC 


3. Write each of the given sets using interval notation.
(a) $[2, 3] \cap (5/2, 7]$
(b) $(-\infty, 4) \cup (-6, \infty)$
(c) $(-2, 4) \cap [-6, \infty)$
(d) $\varnothing \cup (4, 12]$




4. Identify each of the given sets.
(a) $[6, 17] \cap [17, 32)$
(b) $[6, 17) \cap [17, 32)$
(c) $[6, 17] \cup (17, 32)$
(d) $[6, 17) \cup (17, 32)$
5. Draw Venn diagrams. 
(a) ANB 
(b) AUB 
(c) ANBUC 
(d) (AUB)\C 
(ec) ANCNB 
(f) A\BnC 


6. Match each Venn diagram to as many sets as possible. 


(A) (B) (C) 


@) 


(D) (E) 


(F) 


(a) AUB 

(b) A\B 

(c) ANB 

(d) (AUB)\ (ANB) 

(ec) ANBUANCUBNC 

(f) ANBUA\B 

(g) (AU B)N(AUB) 

(h) (AU B)NC]U[(AU B)NC] 

(i) [ANBNC]U[ANBNC] 

(j) A\(BUC)N B\(AUC)NC \ (AUB) 
7. The ordered pair (1,2) is paired with the ordered pair (2, 1) using a set operation. 
Write each resulting set as a roster. 

(a) (1,2)U (2,1) 

(b) (1,2)N@, 1) 

(c) (1,2)\@,1) 


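8. Let $p(x)$ be a formula. Prove the following.
(a) $\forall x[x \in A \cup B \rightarrow p(x)] \rightarrow \forall x[x \in A \cap B \rightarrow p(x)]$
(b) $\forall x[x \in A \cup B \rightarrow p(x)] \leftrightarrow \forall x[x \in A \rightarrow p(x)] \wedge \forall x[x \in B \rightarrow p(x)]$
(c) $\exists x[x \in A \cap B \wedge p(x)] \rightarrow \exists x[x \in A \wedge p(x)] \wedge \exists x[x \in B \wedge p(x)]$
(d) $\exists x[x \in A \cup B \wedge p(x)] \leftrightarrow \exists x[x \in A \wedge p(x)] \vee \exists x[x \in B \wedge p(x)]$


9. Find a formula $p(x)$ and sets $A$ and $B$ to show that the following are false.
(a) $\forall x[x \in A \rightarrow p(x)] \vee \forall x[x \in B \rightarrow p(x)] \rightarrow \forall x[x \in A \cup B \rightarrow p(x)]$
(b) $\exists x[x \in A \wedge p(x)] \wedge \exists x[x \in B \wedge p(x)] \rightarrow \exists x[x \in A \cap B \wedge p(x)]$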


3.3. SETS WITHIN SETS 


An important relation between any two sets is when one is contained within another. 


Subsets 


Let $A$ and $B$ be sets. $A$ is a subset of $B$ exactly when every element of $A$ is also an
element of $B$, in symbols $A \subseteq B$. This is represented in a Venn diagram by the circle
for $A$ being within the circle for $B$ [Figure 3.6(a)].

DEFINITION 3.3.1
For all sets $A$ and $B$,

$$A \subseteq B \Leftrightarrow \forall x(x \in A \rightarrow x \in B).$$

If $A$ is not a subset of $B$, write $A \nsubseteq B$. This is represented in a Venn diagram by $A$
overlapping $B$ with a point in $A$ but not within $B$ [Figure 3.6(b)]. Logically, this means

$$A \nsubseteq B \Leftrightarrow \neg\forall x(x \in A \rightarrow x \in B) \Leftrightarrow \exists x(x \in A \wedge x \notin B).$$

Thus, to show $A \nsubseteq B$, find an element in $A$ that is not in $B$. For example, if we let
$A = \{1, 2, 3\}$ and $B = \{1, 2, 5\}$, then $A \nsubseteq B$ because $3 \in A$ but $3 \notin B$.
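The definition of a subset and the search for a witness to a non-subset translate directly into code. This is a minimal sketch for finite sets; the helper names `is_subset` and `witnesses` are hypothetical.

```python
def is_subset(A, B):
    """A is a subset of B when every element of A is also an element of B."""
    return all(x in B for x in A)

def witnesses(A, B):
    """Elements of A that are not in B; nonempty exactly when A is not a subset of B."""
    return {x for x in A if x not in B}

A = {1, 2, 3}
B = {1, 2, 5}
```

Here `witnesses(A, B)` returns `{3}`, the same element used in the text to show that A is not a subset of B. The universal statement over the empty set is vacuously true, so `is_subset(set(), B)` holds for any `B`.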


Figure 3.6 Venn diagrams for a subset and a nonsubset: (a) $A \subseteq B$, (b) $A \nsubseteq B$.




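EXAMPLE 3.3.2

The proposition for all sets $A$ and $B$, $A \cup B \subseteq A$ is false. To see this, we must
prove that there exist $A$ and $B$ such that $A \cup B \nsubseteq A$. We take $A = \{1\}$ and
$B = \{2\}$ as our candidates. Since $A \cup B = \{1, 2\}$ and $2 \notin A$, we have $A \cup B \nsubseteq A$.

Every set is the improper subset of itself. The notation $A \subset B$ means $A \subseteq B$ but
$A \neq B$. In this case, $A$ is a proper subset of $B$.

EXAMPLE 3.3.3
• $\{1, 2, 3\} \subset \{1, 2, 3, 4, 5\}$ and $\{1, 2, 3\} \not\subset \{1, 2, 3\}$
• $\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$
• $(4, 5) \subset (4, 5] \subset [4, 5]$
• $\{1, 2, 3\}$ is not a subset of $\{1, 2, 4\}$.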


Our study of subsets begins with a fundamental theorem. It states that the empty set 
is a subset of every set. 


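THEOREM 3.3.4
$\varnothing \subseteq A$.

PROOF
Let $A$ be a set. Since $x \notin \varnothing$, we have

$$\forall x(x \notin \varnothing) \Rightarrow \forall x(x \notin \varnothing \vee x \in A) \Rightarrow \forall x(x \in \varnothing \rightarrow x \in A) \Rightarrow \varnothing \subseteq A.$$ ∎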


Proving that one set is a subset of another always means proving an implication, so 
Direct Proof is the primary tool in these proofs. That is, usually to prove that a set A is 
a subset of a set B, take an element x € A and show x € B. 


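EXAMPLE 3.3.5
• Let $x \in \mathbb{Z} \setminus \mathbb{N}$. Then, $x \in \mathbb{Z}$ but $x \notin \mathbb{N}$, so $x \in \mathbb{Z}$ by Simp. Thus, $\mathbb{Z} \setminus \mathbb{N} \subseteq \mathbb{Z}$.

• Let
$$A = \{x : \exists n \in \mathbb{Z}[(x - 2n)(x - 2n - 2) = 0]\}$$
and
$$B = \{2n : n \in \mathbb{Z}\}.$$
Take $x \in A$. This means that
$$\exists n \in \mathbb{Z}[(x - 2n)(x - 2n - 2) = 0].$$
In other words,
$$(x - 2n)(x - 2n - 2) = 0$$
for some $n \in \mathbb{Z}$. Hence,
$$x - 2n = 0 \text{ or } x - 2n - 2 = 0.$$
We have two cases to check. If $x - 2n = 0$, then $x = 2n$. This means $x \in B$. If
$x - 2n - 2 = 0$, then $x = 2n + 2 = 2(n + 1)$, which also means $x \in B$ since $n + 1$
is an integer. Hence, $A \subseteq B$.

• Fix $a, b \in \mathbb{Z}$. Let $x = na$ for some $n \in \mathbb{Z}$. This means $x = na + 0b$, which implies
that $\{na : n \in \mathbb{Z}\} \subseteq \{na + mb : n, m \in \mathbb{Z}\}$.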


This next result is based only on the definitions and Inference Rules 1.2.10. As with
Section 3.2, we see the close ties between set theory and logic.

THEOREM 3.3.6
• $A \subseteq A$.
• If $A \subseteq B$ and $B \subseteq C$, then $A \subseteq C$.
• If $A \subseteq B$ and $x \in A$, then $x \in B$.
• If $A \subseteq B$ and $x \notin B$, then $x \notin A$.
• Let $A \subseteq B$ and $C \subseteq D$. If $x \in A$ or $x \in C$, then $x \in B$ or $x \in D$.
• Let $A \subseteq B$ and $C \subseteq D$. If $x \notin B$ or $x \notin D$, then $x \notin A$ or $x \notin C$.

PROOF
Assume $A \subseteq B$ and $B \subseteq C$. By definition, $x \in A$ implies $x \in B$ and $x \in B$
implies $x \in C$. Therefore, by HS, if $x \in A$, then $x \in C$. In other words, $A \subseteq C$.
The remaining parts are left to Exercise 4. ∎

Note that it is not the case that for all sets $A$ and $B$, $A \subseteq B$ implies that $B \subseteq A$. To
prove this, choose $A = \varnothing$ and $B = \{0\}$. Then, $A \subseteq B$ yet $B \nsubseteq A$ because $0 \in B$ and
$0 \notin \varnothing$.


Equality 


For two sets to be equal, they must contain exactly the same elements (page 118 of 
Section 3.1). This can be stated more precisely using the idea of a subset. 


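DEFINITION 3.3.7
$A = B$ means $A \subseteq B$ and $B \subseteq A$.

To prove that two sets are equal, we show both inclusions. Let us do this to prove
$\overline{A \cup B} = \overline{A} \cap \overline{B}$. This is one of De Morgan’s laws. To prove it, we demonstrate both

$$\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$$

and

$$\overline{A} \cap \overline{B} \subseteq \overline{A \cup B}.$$

This amounts to proving a biconditional, which means we will use the rule of biconditional
proof. Look at the first direction:

$x \in \overline{A \cup B}$                                Given
$\neg(x \in A \cup B)$                             Definition of complement
$\neg(x \in A \vee x \in B)$                       Definition of union
$x \notin A \wedge x \notin B$                     De Morgan’s law
$x \in \overline{A} \wedge x \in \overline{B}$     Definition of complement
$x \in \overline{A} \cap \overline{B}$             Definition of intersection

Now read backward through those steps. Each follows logically when read in this order
because only definitions and replacement rules were used. This means that the steps
are reversible. Hence, we have a series of biconditionals, and we can use the short rule
of biconditional proof (page 108):

$$\begin{aligned}
x \in \overline{A \cup B} &\Leftrightarrow \neg(x \in A \cup B) \\
&\Leftrightarrow \neg(x \in A \vee x \in B) \\
&\Leftrightarrow x \notin A \wedge x \notin B \\
&\Leftrightarrow x \in \overline{A} \wedge x \in \overline{B} \\
&\Leftrightarrow x \in \overline{A} \cap \overline{B}.
\end{aligned}$$

Hence, $\overline{A \cup B} = \overline{A} \cap \overline{B}$.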


We must be careful when writing these types of proofs since it is easy to confuse
the notation.

• The correct translation for $x \in A \cap B$ is $x \in A \wedge x \in B$. Common mistakes for
this translation include using formulas with set operations:

$$x \in A \cap x \in B$$

Incorrect. A set should be on either side of a set operation.

...and sets with connectives:

$$x \in A \wedge B$$

Here $x \in A$ is correct, but $B$ is incorrect. A formula should be present there.

Remember that connectives connect formulas and set operations connect sets.

• Negations also pose problems. If a complement is used, first translate using
$$x \in \overline{A} \Leftrightarrow x \in U \wedge x \notin A \Leftrightarrow x \notin A$$
and then proceed with the proof. Similarly, the formula $x \notin A \cup B$ can be written
as neither $x \notin A \cup x \notin B$ nor $x \notin A \vee x \notin B$. Instead, use DeM:

$$\begin{aligned}
x \notin A \cup B &\Leftrightarrow \neg(x \in A \cup B) \\
&\Leftrightarrow \neg(x \in A \vee x \in B) \\
&\Leftrightarrow x \notin A \wedge x \notin B \\
&\Leftrightarrow x \in \overline{A} \cap \overline{B}.
\end{aligned}$$

We can now prove many basic properties about set operations. Notice how the following
are closely related to their corresponding replacement rules (1.3.9).


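THEOREM 3.3.8

• Associative Laws
$A \cap B \cap C = A \cap (B \cap C)$
$A \cup B \cup C = A \cup (B \cup C)$

• Commutative Laws
$A \cap B = B \cap A$
$A \cup B = B \cup A$

• De Morgan’s Laws
$\overline{A \cup B} = \overline{A} \cap \overline{B}$
$\overline{A \cap B} = \overline{A} \cup \overline{B}$

• Distributive Laws
$A \cap (B \cup C) = A \cap B \cup A \cap C$
$A \cup B \cap C = (A \cup B) \cap (A \cup C)$

• Idempotent Laws
$A \cap A = A$
$A \cup A = A$.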
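Before proving laws like these, it can be reassuring to test them exhaustively on a small universe. The sketch below is only a sanity check and not a proof: it assumes a three-element universe `U`, and the helpers `subsets`, `comp`, and `laws_hold` are hypothetical names.

```python
from itertools import combinations

U = frozenset({0, 1, 2})  # small universe chosen for illustration

def subsets(S):
    """All subsets of S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(X):
    """Complement of X relative to U."""
    return U - X

def laws_hold():
    """Check De Morgan's and the distributive laws for all subsets of U."""
    P = subsets(U)
    for A in P:
        for B in P:
            if comp(A | B) != comp(A) & comp(B):
                return False
            if comp(A & B) != comp(A) | comp(B):
                return False
            for C in P:
                if A & (B | C) != (A & B) | (A & C):
                    return False
                if A | (B & C) != (A | B) & (A | C):
                    return False
    return True
```

A failed check would expose a misstated law immediately, while a passed check only says the law survives on this universe; the general statement still needs the element-wise argument.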


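EXAMPLE 3.3.9

We use the short rule of biconditional proof (page 108) to prove the equality
$A \cap B \cap C = A \cap (B \cap C)$.
$$\begin{aligned}
x \in A \cap B \cap C &\Leftrightarrow x \in A \cap B \wedge x \in C \\
&\Leftrightarrow x \in A \wedge x \in B \wedge x \in C \\
&\Leftrightarrow x \in A \wedge (x \in B \wedge x \in C) \\
&\Leftrightarrow x \in A \wedge (x \in B \cap C) \\
&\Leftrightarrow x \in A \cap (B \cap C).
\end{aligned}$$

Another way to prove it is to use a chain of equal signs.

$$\begin{aligned}
A \cap B \cap C &= \{x : x \in A \cap B \wedge x \in C\} \\
&= \{x : x \in A \wedge x \in B \wedge x \in C\} \\
&= \{x : x \in A \wedge (x \in B \wedge x \in C)\} \\
&= \{x : x \in A \wedge x \in B \cap C\} \\
&= A \cap (B \cap C).
\end{aligned}$$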

We have to be careful when proving equality. If two sets are equal, there are always 
proofs for both inclusions. However, the steps needed for the one implication might 
not simply be the steps for the converse in reverse. The next example illustrates this. 
It is always true that $A \cap B \subseteq A$. However, the premise is needed to show the other
inclusion. 


EXAMPLE 3.3.10
Let $A \subseteq B$. Prove $A \cap B = A$.

• Let $x \in A \cap B$. This means that $x \in A$ and $x \in B$. Then, $x \in A$ (Simp).

• Assume that $x$ is an element of $A$. Since $A \subseteq B$, $x$ is also an element of $B$.
Hence, $x \in A \cap B$.

A more involved example of this uses the concept of divisibility. Let $a, b \in \mathbb{Z}$, not
both equal to zero. A common divisor of $a$ and $b$ is $c$ when $c \mid a$ and $c \mid b$. For
example, 4 is a common divisor of 48 and 36, but it is not the largest.


DEFINITION 3.3.11

Let $a, b \in \mathbb{Z}$ with $a$ and $b$ not both zero. The integer $g$ is the greatest common
divisor of $a$ and $b$ if $g$ is a common divisor of $a$ and $b$ and $e \le g$ for every common
divisor $e \in \mathbb{Z}$. In this case, write $g = \gcd(a, b)$.

For example,
$$12 = \gcd(48, 36)$$
and
$$7 = \gcd(0, 7).$$

Notice that it is important that at least one of the integers is not zero. The $\gcd(0, 0)$ is
undefined since all $a$ such that $a \neq 0$ divide 0. Further notice that the greatest common
divisor is positive.


EXAMPLE 3.3.12

Let $a, b \in \mathbb{Z}$. Prove for all $n \in \mathbb{Z}$,

$$\gcd(a + nb, b) = \gcd(a, b).$$

If both pairs of numbers have the same common divisors, their greatest common
divisors must be equal. So, define

$$S = \{k \in \mathbb{Z} : k \mid a + nb \wedge k \mid b\}$$
and
$$T = \{k \in \mathbb{Z} : k \mid a \wedge k \mid b\}.$$
To show that the greatest common divisors are equal, prove $S = T$.

• Let $d \in S$. Then $d \mid a + nb$ and $d \mid b$. This means $a + nb = dl$ and $b = dk$
for some $l, k \in \mathbb{Z}$. We are left to show $d \mid a$. By substitution, $a + ndk = dl$.
Hence, $d \in T$ because

$$a = dl - ndk = d(l - nk).$$

• Now take $d \in T$. This means $d \mid a$ and $d \mid b$. Thus, there exist $l, k \in \mathbb{Z}$
such that $a = dl$ and $b = dk$. Then,
$$a + nb = dl + ndk = d(l + nk).$$
Therefore, $d \mid a + nb$ and $d \in S$.
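The equality of the two sets of common divisors can be observed numerically. The following sketch assumes integers that are not both zero; `common_divisors` is a hypothetical helper, and `math.gcd` is the standard-library function. The sample values 48 and 36 are taken from the earlier gcd example.

```python
from math import gcd

def common_divisors(a, b):
    """All nonzero integers k with k | a and k | b (a, b not both zero)."""
    bound = max(abs(a), abs(b))
    return {k for k in range(-bound, bound + 1)
            if k != 0 and a % k == 0 and b % k == 0}

a, b = 48, 36

# S is built from (a + n*b, b) and T from (a, b); they should coincide.
T = common_divisors(a, b)
S_by_n = {n: common_divisors(a + n * b, b) for n in range(-3, 4)}
```

Every set stored in `S_by_n` equals `T`, so the largest element, the greatest common divisor, is the same for each pair.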


As with subsets, let us now prove some results concerning the empty set and the 
universe. We use two strategies. 


• Let $A$ be a set. We know that
$$A = \varnothing \text{ if and only if } \forall x(x \notin A).$$

Therefore, to prove that $A$ is empty, take an arbitrary $a$ and show $a \notin A$. This can
sometimes be done directly, but more often an indirect proof is better. That is,
assume $a \in A$ and derive a contradiction. Since the contradiction arose simply
by assuming $a \in A$, this formula must be the problem. Hence, $A$ can have no
elements.

• Let $U$ be a universe. To prove $A = U$, we must show that $A \subseteq U$ and $U \subseteq A$.
The first subset relation is always true. To prove the second, take an arbitrary
element and show that it belongs to $A$. This works because $U$ contains all possible
elements.

EXAMPLE 3.3.13
• Suppose $x \in A \cap \varnothing$. Then $x \in \varnothing$, which is impossible. Therefore, $A \cap \varnothing = \varnothing$.

• Certainly, $A \subseteq A \cup \varnothing$, so to show the opposite inclusion take $x \in A \cup \varnothing$. Since
$x \notin \varnothing$, it must be the case that $x \in A$. Thus, $A \cup \varnothing \subseteq A$, and we have that
$A \cup \varnothing = A$.

• From Exercise 5(b), we know $A \cap U \subseteq A$, so let $x \in A$. This means that $x$ must
also belong to the universe. Hence, $x \in A \cap U$, so $A \cap U = A$.

• Certainly, $A \cup U \subseteq U$. Moreover, by Exercise 5(c), we have the other inclusion.
Thus, $A \cup U = U$.




EXAMPLE 3.3.14

To prove that a set $A$ is not equal to the empty set, show $\neg\forall x(x \notin A)$, but this is
equivalent to $\exists x(x \in A)$. For instance, let

$$A = \{x \in \mathbb{R} : x^2 + 6x + 5 = 0\}.$$

We know that $A$ is nonempty since $-1 \in A$.

EXAMPLE 3.3.15

Let
$$A = \{(a, b) \in \mathbb{R} \times \mathbb{R} : a + b = 0\}$$
and
$$B = \{(0, b) : b \in \mathbb{R}\}.$$

These sets are not disjoint since $(0, 0)$ is an element of both $A$ and $B$. However,
$A \neq B$ because $(1, -1) \in A$ but $(1, -1) \notin B$.


Let us combine the two strategies to show a relationship between $\varnothing$ and $U$.

THEOREM 3.3.16

For all sets $A$ and $B$ and universe $U$, the following are equivalent.

• $A \subseteq B$
• $\overline{A} \cup B = U$
• $A \cap \overline{B} = \varnothing$.

PROOF

• Assume $A \subseteq B$. Suppose $x \notin \overline{A}$. Then, $x \in A$, which implies that $x \in B$.
Hence, for every element $x$, we have that $x \in \overline{A}$ or $x \in B$, and we conclude that
$\overline{A} \cup B = U$.

• Suppose $\overline{A} \cup B = U$. In order to obtain a contradiction, take $x \in A \cap \overline{B}$.
Then, $x \notin B$. Since $x \in A$, the supposition also gives $x \in B$, a contradiction.
Therefore, $A \cap \overline{B} = \varnothing$.

• Let $A \cap \overline{B} = \varnothing$. Assume $x \in A$. By hypothesis, $x$ cannot be a member of $\overline{B}$,
otherwise the intersection would be nonempty. Hence, $x \in B$. ∎
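The equivalence of the three conditions can be checked on every pair of subsets of a small universe. This is a sanity check under stated assumptions, not a proof: the universe `U` has three elements, and `subsets`, `conditions`, and `all_agree` are hypothetical helper names.

```python
from itertools import combinations

U = frozenset({0, 1, 2})  # small universe chosen for illustration

def subsets(S):
    """All subsets of S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def conditions(A, B):
    """Truth values of: A subset of B, comp(A) | B == U, A & comp(B) empty."""
    return (A <= B, (U - A) | B == U, A & (U - B) == frozenset())

def all_agree():
    """True when the three conditions agree for every pair of subsets."""
    return all(len(set(conditions(A, B))) == 1
               for A in subsets(U) for B in subsets(U))
```

For each pair the three truth values are either all true or all false, which is what "the following are equivalent" asserts.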


The following theorem is a generalization of the corresponding result concerning
subsets. The proof could have been written using the short rule of biconditional proof
or by appealing to Theorem 3.3.6 (Exercise 21).

THEOREM 3.3.17
If $A = B$ and $B = C$, then $A = C$.

PROOF
Assume $A = B$ and $B = C$. This means $A \subseteq B$, $B \subseteq A$, $B \subseteq C$, and $C \subseteq B$.
We must show that $A = C$.

• Let $x \in A$. By hypothesis, $x$ is then an element of $B$, which implies that
$x \in C$.

• Let $x \in C$. Then, $x \in B$, from which $x \in A$ follows. ∎


The last result of the section involves the Cartesian product. The first part is illus- 
trated in Figure 3.7. The sets B and C are illustrated along the vertical axis and A is 
illustrated along the horizontal axis. The Cartesian products are represented as boxes. 
The other parts of the theorem can be similarly visualized. 

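THEOREM 3.3.18

• $A \times (B \cap C) = (A \times B) \cap (A \times C)$.
• $A \times (B \cup C) = (A \times B) \cup (A \times C)$.
• $A \times (B \setminus C) = (A \times B) \setminus (A \times C)$.
• $(A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$.

PROOF
We prove the first equation. The last three are left to Exercise 19. Take three sets
$A$, $B$, and $C$. Then,

$$\begin{aligned}
(a, b) \in A \times (B \cap C) &\Leftrightarrow a \in A \wedge b \in B \cap C \\
&\Leftrightarrow a \in A \wedge b \in B \wedge a \in A \wedge b \in C \\
&\Leftrightarrow (a, b) \in A \times B \wedge (a, b) \in A \times C \\
&\Leftrightarrow (a, b) \in (A \times B) \cap (A \times C).
\end{aligned}$$ ∎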


Figure 3.7 $A \times (B \cap C) = (A \times B) \cap (A \times C)$.




Inspired by the last result, we might try proving that $(A \times B) \cup (C \times D)$ is always
equal to $(A \cup C) \times (B \cup D)$, but no such proof exists. To show this, take $A = B = \{1\}$
and $C = D = \{2\}$. Then

$$(A \times B) \cup (C \times D) = \{(1, 1)\} \cup \{(2, 2)\} = \{(1, 1), (2, 2)\},$$

but

$$(A \cup C) \times (B \cup D) = \{1, 2\} \times \{1, 2\} = \{(1, 1), (1, 2), (2, 1), (2, 2)\}.$$

Hence, $(A \cup C) \times (B \cup D) \nsubseteq (A \times B) \cup (C \times D)$. Notice, however, that the opposite
inclusion is always true (Exercise 3.3.8).
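The counterexample can be recomputed to see exactly which pairs break the equality. The sketch assumes tuples as ordered pairs; the names `union_of_products` and `product_of_unions` are hypothetical.

```python
from itertools import product

A = B = {1}
C = D = {2}

# (A x B) U (C x D)
union_of_products = set(product(A, B)) | set(product(C, D))

# (A U C) x (B U D)
product_of_unions = set(product(A | C, B | D))

# Pairs in the larger set that are missing from the smaller one.
extra = product_of_unions - union_of_products
```

The two "mixed" pairs in `extra` take one coordinate from A or B and the other from C or D, which is why only one inclusion holds in general.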


Exercises 


1. Answer true or false. 


(a) 
(b) 


GED 
@ CE {I} 
1EZ 
1cZ 
leg 
{i} oo 
0ED 
{l} EZ 
@C@ 


2. Answer true or false. For each false proposition, find one element that is in the first 
set but is not in the second. 


(a) 
(b) 


3. Prove. 


(a) 
(b) 


TEC 
QtcZzt 
Q\RCZ 
R\QCZ 
Zn(-1,) ¢Q 
(0, 1) ¢ Qt 
(O;.1) S00;1,.2} 
(0,1) € (0, 1] 


{x ER: x2-3x+2=0} CN 
(0,1) € [0, 1] 

[0, 1] Z (0, 1) 

ZxZZZxN 
O,DNAQE[0,1NZ 

RCC 

{bi:bER} CC 

CER 




4. Prove the remaining parts of Theorem 3.3.6. 


5. Prove. 
(a) ACU 
(b) ANBCA 
(c) AC AUB 
(d) A\BCA 
(ce) If AC B, then AUC C BUC. 
(f) If AC B, then ANC C BNC. 
(g) If AC B,thenC\BCC\A. 
(h) IfA #2, then A A. 
(i) If AC B,then BCA. 
G) IfACB,then {1} x AC {1} xB. 
(k) If AC Cand BC D, then Ax B 
(l) If AG Cand BC D, thenC x D 


Cx D. 
AX B. 


€ 
€ 


6. Prove that A C B if and only if BCA. 


7. Show that the given proposition is false: 


for all sets A and B, if AN BF ©, then A £ ANB. 


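8. Prove: $(A \times B) \cup (C \times D) \subseteq (A \cup C) \times (B \cup D)$.
9. Take $a, b, c \in \mathbb{N}$. Let $A = \{n \in \mathbb{N} : n \mid a\}$ and $C = \{n \in \mathbb{N} : n \mid c\}$. Suppose $a \mid b$
and $b \mid c$. Prove $A \subseteq C$.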


10. Prove.
(a) $B \subseteq A$ and $C \subseteq A$ if and only if $B \cup C \subseteq A$.
(b) $A \subseteq B$ and $A \subseteq C$ if and only if $A \subseteq B \cap C$.


11. Prove the unproven parts of Theorem 3.3.8. 


12. Prove each equality. 
(a) @=U 
(b) 


G) AUBNB=2 
(kk) ANB\A=2 




13. Sketch a Venn diagram for each problem and then write a proof. 
(a) A=ANBUANB 
(bt) AUB=AUANB 
(c) A\(B\C)=An(BUC) 
(d) A\(ANB)=A\B 
(ec) ANBUANBUANB=AUB 
(f) (AUB)\C=A\CUB\C 
(g) A\(B\C)=A\BUANC 
(hy) A\ B\C=A\(BUC) 


14. Prove. 
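(a) If $A \subseteq B$, then $A \setminus B = \varnothing$.
(b) If $A \subseteq \varnothing$, then $A = \varnothing$.
(c) Let $U$ be a universe. If $U \subseteq A$, then $A = U$.
(d) If $A \subseteq B$, then $B \setminus (B \setminus A) = A$.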
(e) AX B= Gif and only if A = @ or 


15. Let $a, c, m \in \mathbb{Z}$ and define $A = \{a + mk : k \in \mathbb{Z}\}$ and $B = \{a + m(c + k) : k \in \mathbb{Z}\}$.
Show $A = B$.


16. Prove A = B, where 


A={[6°| :a+b=0Aa,bER} 


and 
B={[25] :a=-dab +c? =0Aa,b,c,d ER}. 
17. Prove. 
(a) Q#Z 
(b) C#R 


(c) {O}xZ4Z 
(d) RXxZ#ZxR 
(e) If A= {ax? +b: a,b € R} and B= {x°+b:b ER}, then A F B. 
(f) If A= {ax?+b: a,b € Z} and B= {ax? +b: a,b €C}, then AF B. 
(g) A+B, where 
A={[2°] :a,beER} 
and 
B={[25] :a,b,c,d ER}. 


18. Find A and B to illustrate the given inequalities. 
(a) A\B#B\A 
(b) (Ax B)xC#AX(BXC) 
(c) AXBABXxA 


19. For the remaining parts of Theorem 3.3.18, draw diagrams as in Figure 3.7 and 
prove the results. 




20. Is it possible for A = A? Explain. 


21. Prove Theorem 3.3.17 by first using the short rule of biconditional proof and then 
by directly appealing to Theorem 3.3.6. 


22. Prove that the following are equivalent. 


°° ACB 

° AUB=B 
© A\B=f 
© ANB=A 

23. Prove that the following are equivalent. 

© ANB=6 
- A\B=G 
- ACB 


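24. Find an example of sets $A$, $B$, and $C$ such that $A \cap B = A \cap C$ but $B \neq C$.
25. Does $A \cup B = A \cup C$ imply $B = C$ for all sets $A$ and $B$? Explain.


26. Prove.
(a) If $A \cup B \subseteq A \cap B$, then $A = B$.
(b) If $A \cap B = A \cap C$ and $A \cup B = A \cup C$, then $B = C$.


27. Prove that there is a cancellation law with the Cartesian product. Namely, if $A \neq \varnothing$
and $A \times B = A \times C$, then $B = C$.


28. When does $A \times B = C \times D$ imply that $A = C$ and $B = D$?


29. Prove.
(a) If $A \cup B \neq \varnothing$, then $A \neq \varnothing$ or $B \neq \varnothing$.
(b) If $A \cap B \neq \varnothing$, then $A \neq \varnothing$ and $B \neq \varnothing$.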


30. Find the greatest common divisors of each pair. 
(a) 12 and 18 
(b) 3 and 9
(c) 14 and 0
(d) 7 and 15


31. Let $a$ be a positive integer. Find the following and prove the result.
(a) $\gcd(a, a + 1)$
(b) $\gcd(a, 2a)$
(c) $\gcd(a, a^2)$
(d) $\gcd(a, 0)$




3.4 FAMILIES OF SETS 


The elements of a set can be sets themselves. We call such a collection a family of sets
and often use capital script letters to name them. For example, let

$$\mathcal{E} = \{\{1, 2, 3\}, \{2, 3, 4\}, \{3, 4, 5\}\}. \tag{3.2}$$
The set $\mathcal{E}$ has three elements: $\{1, 2, 3\}$, $\{2, 3, 4\}$, and $\{3, 4, 5\}$.

EXAMPLE 3.4.1
• $\{1, 2, 3\} \in \{\{1, 2, 3\}, \{1, 4, 9\}\}$
• $1 \notin \{\{1, 2, 3\}, \{1, 4, 9\}\}$
• $\{1, 2, 3\} \nsubseteq \{\{1, 2, 3\}, \{1, 4, 9\}\}$
• $\{\{1, 2, 3\}\} \notin \{\{1, 2, 3\}, \{1, 4, 9\}\}$.
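The difference between being an element of a family and being a subset of it is easy to confuse, and a short experiment helps. This sketch assumes Python `frozenset` objects for the inner sets, because elements of a Python set must be hashable; all variable names are hypothetical.

```python
S = frozenset({1, 2, 3})
family = {S, frozenset({1, 4, 9})}

# {1, 2, 3} is an element of the family.
S_is_element = S in family

# The number 1 is an element of {1, 2, 3}, but not of the family.
one_is_element = 1 in family

# {1, 2, 3} is not a subset: its elements 1, 2, 3 are not in the family.
S_is_subset = all(x in family for x in S)

# The singleton {{1, 2, 3}} is a subset of the family but not an element of it.
singleton_is_subset = {S} <= family
singleton_is_element = frozenset({S}) in family
```

The family has exactly two elements, each of which is itself a set of three numbers.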
M@ EXAMPLE 3.4.2 
« @ C {G,{S@}} by Theorem 3.3.4. 
« {2} C {S,{G}} because S@ € {B,{D}}. 
e {{G}} C {S, {B}} because {SJ} € {G,{D}}. 
Families of sets can have infinitely many elements. For example, let
ℱ = {[n, n + 1] : n ∈ Z}. (3.3)
In roster notation,
ℱ = {..., [-2,-1], [-1,0], [0,1], [1,2], ...}.


Notice that in this case, abstraction is more convenient. For each integer n, the closed
interval [n, n + 1] is in ℱ. The set Z plays the role of an index set, a set whose only
purpose is to enumerate the elements of the family. Each element of an index set is
called an index. If we let I = Z and A_i = [i, i + 1], the family can be written as


ℱ = {A_i : i ∈ I}.


To write the family ℰ (3.2) using an index set, let I = {0, 1, 2} and define


A_0 = {1,2,3},
A_1 = {2,3,4},
A_2 = {3,4,5}.


Then, the family illustrated in Figure 3.8 is


ℰ = {A_i : i ∈ I} = {A_0, A_1, A_2}.


Figure 3.8 The family of sets ℰ = {A_i : i ∈ I} with I = {0, 1, 2}.


There is no reason why I must be {0, 1, 2}. Any three-element set will do. The order
in which the sets are defined is also irrelevant. For instance, we could have defined
I = {w, π, 99} and


A_w = {3,4,5},
A_π = {2,3,4},
A_99 = {1,2,3}.


The goal is to have each set in the family referenced or indexed by at least one element
of the index set. We will still have ℰ = {A_i : i ∈ I} with a similar diagram (Figure 3.9).


Figure 3.9 The family of sets ℰ = {A_i : i ∈ I} with I = {w, π, 99}.


■ EXAMPLE 3.4.3


Write ℱ (3.3) using N as the index set instead of Z. Use the even natural numbers
to index the intervals with a nonnegative integer left-hand endpoint. The
odd natural numbers will index the intervals with a negative integer left-hand
endpoint. To do this, define the sets B_i as follows. Use 2n + 1 to represent the
odd natural numbers and 2n to represent the even natural numbers (n ∈ N). Then,


B_{2(2)+1} = [-2 - 1, -2]
B_{2(1)+1} = [-1 - 1, -1]
B_{2(0)+1} = [-0 - 1, -0]


and


B_{2(0)} = [0, 0 + 1]
B_{2(1)} = [1, 1 + 1]
B_{2(2)} = [2, 2 + 1].


Therefore, define for all natural numbers n,
B_{2n+1} = [-n - 1, -n]
and
B_{2n} = [n, n + 1].


We have indexed the elements of ℱ as


B_0 = [0, 1]
B_1 = [-1, 0]
B_2 = [1, 2]
B_3 = [-2, -1]
B_4 = [2, 3]
B_5 = [-3, -2]


So under this definition, ℱ = {B_i : i ∈ N}.
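
The reindexing in this example is a rule that sends each natural number to an
interval. As a rough sketch only (Python is not used in the text, and the function
name B and the pair representation of a closed interval are assumptions made here),
the rule can be written out and spot-checked:

```python
def B(i):
    """Return (left, right), the endpoints of the closed interval indexed by i."""
    if i < 0:
        raise ValueError("the index must be a natural number")
    n, remainder = divmod(i, 2)
    if remainder == 0:
        return (n, n + 1)        # even index 2n     -> [n, n + 1]
    return (-n - 1, -n)          # odd index 2n + 1  -> [-n - 1, -n]

first_six = [B(i) for i in range(6)]

# The left-hand endpoints reached by the indices 0..9 are exactly -5, ..., 4,
# so no interval in that window is skipped or indexed twice.
left_endpoints = sorted(B(i)[0] for i in range(10))
```

The first six values agree with the list B_0, ..., B_5 given in the example.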




Power Set 


There is a natural way to form a family of sets. Take a set A. The collection of all 
subsets of A is called the power set of A. It is represented by P(A). 


■ DEFINITION 3.4.4
For any set A,
P(A) = {B : B ⊆ A}.
Notice that ∅ ∈ P(A) by Theorem 3.3.4.


■ EXAMPLE 3.4.5
• P({1,2,3}) = {∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}
• P({∅, {∅}}) = {∅, {∅}, {{∅}}, {∅, {∅}}}
• P(N) = {∅, {0}, {1}, ..., {0,1}, {0,2}, ...}.
Consider A = {2,6} and B = {2,6,10}, so A ⊆ B. Examining the power sets of each,
we find that
P(A) = {∅, {2}, {6}, {2,6}}


and
P(B) = {∅, {2}, {6}, {10}, {2,6}, {2,10}, {6,10}, {2,6,10}}.


Hence, P(A) ⊆ P(B). This result is generalized in the next lemma.
■ LEMMA 3.4.6
A ⊆ B if and only if P(A) ⊆ P(B).


PROOF
• Let A ⊆ B. Assume X ∈ P(A). Then, X ⊆ A, which gives X ⊆ B by
Theorem 3.3.6. Hence, X ∈ P(B).


• Assume P(A) ⊆ P(B). Let x ∈ A. In other words, {x} ⊆ A, but this means that
{x} ∈ P(A). Hence, {x} ∈ P(B) by hypothesis, so x ∈ B.
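
For finite sets, the power set and the lemma can be explored computationally. The
sketch below is only an illustration: Python and the helper name power_set are
assumptions introduced here, and checking a few finite cases is of course not a proof.

```python
from itertools import chain, combinations

def power_set(a):
    """Return P(a) for a finite set a, as a set of frozensets."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

A = {2, 6}
B = {2, 6, 10}
PA, PB = power_set(A), power_set(B)

# Both sides of the lemma agree on this pair and on a pair where inclusion fails.
lemma_holds_here = (A <= B) == (PA <= PB)
C, D = {1, 2}, {2, 3}
lemma_holds_there = (C <= D) == (power_set(C) <= power_set(D))
```

Note that P(A) has 4 elements and P(B) has 8, matching the rosters written out above.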


The definition of set equality and Lemma 3.4.6 are used to prove the next theorem. Its 
proof is left to Exercise 9. 


■ THEOREM 3.4.7
A = B if and only if P(A) = P(B).


Union and Intersection 


We now generalize the set operations. Define the union of a family of sets to be the set 
of all elements that belong to some member of the family. This union is denoted by the 
same notation as in Definition 3.2.1 and can be defined using the abstraction method. 


■ DEFINITION 3.4.8
Let ℱ be a family of sets. Define
⋃ℱ = {x : ∃A(A ∈ ℱ ∧ x ∈ A)}.


If the family is indexed so that ℱ = {A_i : i ∈ I}, define
⋃_{i∈I} A_i = {x : ∃i(i ∈ I ∧ x ∈ A_i)}.


Observe that ⋃ℱ = ⋃_{i∈I} A_i [Exercise 16(a)].


We generalize the notion of intersection similarly. The intersection of a family of
sets is the set of all elements that belong to each member of the family.


■ DEFINITION 3.4.9
Let ℱ be a family of sets. Define
⋂ℱ = {x : ∀A(A ∈ ℱ → x ∈ A)}.
If the family is indexed so that ℱ = {A_i : i ∈ I}, define


⋂_{i∈I} A_i = {x : ∀i(i ∈ I → x ∈ A_i)}.


Observe that ⋂ℱ = ⋂_{i∈I} A_i [Exercise 16(b)].


Furthermore, notice that both definitions are indeed generalizations of the operations
of Section 3.2 because as noted in Exercise 17,


⋃{A, B} = A ∪ B,
and
⋂{A, B} = A ∩ B.
■ EXAMPLE 3.4.10
Define ℰ = {[n, n + 1] : n ∈ Z}.


• When all of these intervals are combined, the result is the real line. This
means that
⋃ℰ = R.


• There is not one element that is common to all of the intervals. Hence,
⋂ℰ = ∅.


The next example illustrates how to write the union or intersection of a family of sets 
as a roster. 


■ EXAMPLE 3.4.11
Let ℱ = {{1,2,3}, {2,3,4}, {3,4,5}}.


• Since 1 is in the first set of ℱ, 1 ∈ ⋃ℱ. The others can be explained
similarly, so


⋃ℱ = {1,2,3,4,5}.


Notice that mechanically this amounts to removing the braces around the
sets of the family and setting the union to the resulting set:


Remove braces
↓
{1, 2, 3, 2, 3, 4, 3, 4, 5}
↓
{1, 2, 3, 4, 5} = ⋃ℱ


• The generalized intersection is simply the overlap of all of the sets. Hence,
3 is the only element of ⋂ℱ. That is,


⋂ℱ = {3},


and this is illustrated by the following diagram:


■ EXAMPLE 3.4.12


Since a family of sets can be empty, we must be able to take the union and
intersection of the empty set. To prove that ⋃∅ = ∅, take x ∈ ⋃∅. This means
that there exists A ∈ ∅ such that x ∈ A, but this is impossible. We leave the fact
that ⋂∅ is equal to the universe to Exercise 24.
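
The generalized union and intersection of a finite family can be computed directly,
which makes the last two examples easy to check. This is a sketch only: Python and
the helper names big_union and big_intersection are assumptions made here. Because
the intersection of the empty family is the whole universe, and no universe is
fixed in the code, the sketch refuses that case instead of guessing.

```python
def big_union(family):
    """Union of a finite family: everything that belongs to some member."""
    return set().union(*family)

def big_intersection(family):
    """Intersection of a nonempty finite family: what belongs to every member."""
    members = [set(m) for m in family]
    if not members:
        raise ValueError("the intersection of the empty family is the universe")
    return set.intersection(*members)

F = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
union_of_F = big_union(F)
intersection_of_F = big_intersection(F)
union_of_empty_family = big_union([])
```

The value union_of_empty_family is the empty set, in agreement with the argument
given in the example.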


The next theorem generalizes the Distributive Laws. Exercise 2.4.10 plays an im- 
portant role in its proof. It allows us to move the quantifier. 


■ THEOREM 3.4.13
Let A_i (i ∈ I) and B be sets.


• B ∪ ⋂_{i∈I} A_i = ⋂_{i∈I} (B ∪ A_i)
• B ∩ ⋃_{i∈I} A_i = ⋃_{i∈I} (B ∩ A_i).


PROOF
We leave the second part to Exercise 19. The first part is demonstrated by the
following biconditional proof:


x ∈ B ∪ ⋂_{i∈I} A_i ⇔ x ∈ B ∨ x ∈ ⋂_{i∈I} A_i
⇔ x ∈ B ∨ ∀i(i ∈ I → x ∈ A_i)
⇔ x ∈ B ∨ ∀i(i ∉ I ∨ x ∈ A_i)
⇔ ∀i(x ∈ B ∨ i ∉ I ∨ x ∈ A_i)
⇔ ∀i(i ∉ I ∨ x ∈ B ∨ x ∈ A_i)
⇔ ∀i(i ∈ I → x ∈ B ∨ x ∈ A_i)
⇔ ∀i(i ∈ I → x ∈ B ∪ A_i)
⇔ x ∈ ⋂_{i∈I} (B ∪ A_i). ■


To understand the next result, consider the following example. Choose the universe
to be {1,2,3,4,5} and perform some set operations on the family


{{1,2,3}, {2,3,4}, {2,3,5}}.
First,
\overline{⋂{{1,2,3}, {2,3,4}, {2,3,5}}} = \overline{{2,3}} = {1,4,5},
and second,


⋃{\overline{{1,2,3}}, \overline{{2,3,4}}, \overline{{2,3,5}}} = ⋃{{4,5}, {1,5}, {1,4}} = {1,4,5}.


This leads us to the next generalization of De Morgan’s laws. Its proof is left to
Exercise 23.


■ THEOREM 3.4.14
Let {A_i : i ∈ I} be a family of sets.


• \overline{⋂_{i∈I} A_i} = ⋃_{i∈I} \overline{A_i}
• \overline{⋃_{i∈I} A_i} = ⋂_{i∈I} \overline{A_i}.
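
The computation that motivates the theorem can be repeated mechanically. The
following sketch is only a finite check, not a proof; Python, the name complement,
and the choice to represent the universe as the set U are assumptions made here.

```python
U = {1, 2, 3, 4, 5}                          # the universe chosen in the text
family = [{1, 2, 3}, {2, 3, 4}, {2, 3, 5}]

def complement(s):
    """Complement of s relative to the universe U."""
    return U - s

# Complement of the intersection versus union of the complements.
lhs_first = complement(set.intersection(*family))
rhs_first = set().union(*(complement(s) for s in family))

# Complement of the union versus intersection of the complements.
lhs_second = complement(set().union(*family))
rhs_second = set.intersection(*(complement(s) for s in family))
```

Both pairs agree for this family, as the generalized De Morgan's laws predict.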




Disjoint and Pairwise Disjoint 


What it means for two sets to be disjoint was defined in Section 3.2 (Definition 3.2.3). 
The next definition generalizes that notion to families of sets. Because a family can 
have more than two elements, it is appropriate to expand the concept of disjointness. 


■ DEFINITION 3.4.15
Let ℱ be a family of sets.
• ℱ is disjoint when ⋂ℱ = ∅.
• ℱ is pairwise disjoint when for all A, B ∈ ℱ, if A ≠ B, then A ∩ B = ∅.


Observe that {{1,2}, {3,4}, {5,6}} is both disjoint and pairwise disjoint because its
elements have no common members.


■ EXAMPLE 3.4.16


Let A be a set. We see that
⋃P(A) = {x : ∃B[B ∈ P(A) ∧ x ∈ B]} = A.


Although P(A) is not a pairwise disjoint family of sets, it is disjoint because
∅ ∈ P(A).
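
The two notions of disjointness are easy to confuse, so it can help to test them
on small families. The sketch below assumes Python and uses helper names
(is_disjoint, is_pairwise_disjoint, power_set) that are introduced here for
illustration only. It is restricted to nonempty finite families, since the
intersection of the empty family depends on the universe.

```python
from itertools import chain, combinations

def power_set(a):
    """Return P(a) for a finite set a, as a set of frozensets."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

def is_disjoint(family):
    """True when the intersection of the (nonempty) family is empty."""
    members = [frozenset(m) for m in family]
    if not members:
        raise ValueError("the empty family needs a universe")
    return not frozenset.intersection(*members)

def is_pairwise_disjoint(family):
    """True when any two distinct members have empty intersection."""
    members = {frozenset(m) for m in family}      # keep distinct members only
    return all(not (a & b) for a, b in combinations(members, 2))

blocks = [{1, 2}, {3, 4}, {5, 6}]
P = power_set({1, 2, 3})
```

For blocks both tests succeed, while P passes the first test and fails the second,
matching the remarks above.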


If the family is indexed, we can use another test to determine if it is pairwise disjoint.
Let ℱ = {A_i : i ∈ I} be a family of sets. If for all i, j ∈ I,


i ≠ j implies A_i ∩ A_j = ∅, (3.4)


then ℱ is pairwise disjoint. To prove this, let ℱ be a family that satisfies (3.4). Take
A_i, A_j ∈ ℱ for some i, j ∈ I and assume A_i ≠ A_j. Therefore, i ≠ j, for otherwise
they would be equal. Hence, A_i ∩ A_j = ∅ by Condition 3.4.

The next result illustrates the relationship between these two terms. One must be 
careful, though. The converse is false (Exercise 14). 


■ THEOREM 3.4.17
Let ℱ be a family of sets with at least two elements. If ℱ is pairwise disjoint,
then ℱ is disjoint.


PROOF
Assume ℱ is pairwise disjoint. Since ℱ contains at least two sets, let A, B ∈ ℱ
such that A ≠ B. Then, using Exercise 27, ⋂ℱ ⊆ A ∩ B = ∅. ■


Exercises 


1. Let I = {1,2,3,4,5}, A_1 = {1,2}, A_2 = {3,4}, A_3 = {1,4}, A_4 = {3,4}, and
A_5 = {1,3}. Write the given families of sets as rosters.
(a) {A_i : i ∈ I}
(b) {A_i : i ∈ {2,4}}
(c) {A_i : i = 1}
(d) {A_i : i = 1, 2}
(e) {A_i : i ∈ ∅}
(f) {A_i : i ∈ A_5}
2. Answer true or false. 
(a) 1e {{1}, {2}, {1,2}} 
(b) {1} © {{1}, {2}, (1, 2}} 
(c) {1} ¢ {{1}, {2}, {1,23} 
(d) {1,2} © {{{1,2}, {3,4}}, {1,2}} 
(e) {1,2} € {{{1,2}, {3,43}, {1,2}} 
(@) {3,4} € ((11,2},13,4}},44 23} 
(g) {3,4} © {{{1,2}, {3,43}, (1, 23} 
(hy) Oe {{(1,2),43;41),(4,245 
(i) OC {{{1,2}, {3,4}}, {1,2}} 
G) {9} €{9,{S, {S}}} 
(k) {2} € {P,{P, {P}}} 
() DE {G,{S, {S}}} 
(m) BC {P,{P, {P}}} 
(n) {2} CD 
(0) {2} C{P,{P}} 
(p) {9} C {{P, {P}}} 
(q) {{S}} © {9,{S}} 
(r) {1} e P(Z) 
(s) {1} CP(Z) 
(t) @EP(C) 
(u) {2} € P(2) 
(v) {9} © P(S) 


3. Find sets such that I ∩ J = ∅ but {A_i : i ∈ I} and {A_j : j ∈ J} are not disjoint.
4. Let {A_i : i ∈ K} be a family of sets and let I and J be subsets of K. Define
ℰ = {A_i : i ∈ I} and ℱ = {A_j : j ∈ J} and prove the following.
(a) If I ⊆ J, then ℰ ⊆ ℱ.
(b) ℰ ∪ ℱ = {A_i : i ∈ I ∪ J}
(c) {A_i : i ∈ I ∩ J} ⊆ ℰ ∩ ℱ


5. Using the same notation as in the previous problem, find a family {A_i : i ∈ K} and
subsets I and J of K such that:
(a) ℰ ∩ ℱ ⊈ {A_i : i ∈ I ∩ J}
(b) {A_i : i ∈ I \ J} ⊈ ℰ \ ℱ


6. Show {A_i : i ∈ I} = ∅ if and only if I = ∅.




7. Let A be a finite set. How many elements are in P(A) if A has n elements? Explain.


8. Find the given power sets.
(a) P({1,2})
(b) P(P({1,2}))
(c) P(∅)
(d) P(P(∅))
(e) P({{∅}})
(f) P({∅, {∅}, {∅, {∅}}})
9. Prove Theorem 3.4.7. 


10. For each of the given equalities, prove or show false. If one is false, prove any true 
inclusion. 

(a) P(A ∪ B) = P(A) ∪ P(B)
(b) P(A ∩ B) = P(A) ∩ P(B)
(c) P(A \ B) = P(A) \ P(B)
(d) P(A × B) = P(A) × P(B)


11. Prove P(A) ⊆ P(B) implies A ⊆ B by using the fact that A ∈ P(A).


12. Write the following sets in roster form. 
(a) Uf{{1, 2}, {1,2}, (1 3}, 1.43} 
(b) (4L2)}, £6 23,13) 1409 
(c) (| P@) 
(d) UP) 
(e) UUU{1}}, (023), (0 33} HE 4} 
() UNO} (0, 233. (0. 333, 4} 
(g) (LUCCC1}, (11, 23}, (11, 333 (0, 4333 
Ch) (PCC), (1, 23}, (11, 33 (0, 433 
@ UUs 
@ NUS 
13. Draw Venn diagrams for a disjoint family of sets that is not pairwise disjoint and 
for a pairwise disjoint family of sets. 


14. Show by example that a disjoint family of sets might not be pairwise disjoint. 


15. Given a family of sets {A_i : i ∈ I}, find a family ℬ = {B_i : i ∈ I} such that
{A_i × B_i : i ∈ I} is pairwise disjoint.


16. Let ℱ = {A_i : i ∈ I} be a family of sets. Prove the given equations.
(a) ⋃ℱ = ⋃_{i∈I} A_i
(b) ⋂ℱ = ⋂_{i∈I} A_i
17. Prove for any sets A and B, ⋃{A, B} = A ∪ B and ⋂{A, B} = A ∩ B.
18. Let ℰ = {(0, n) : n ∈ Z⁺}. Prove that ⋃ℰ = (0, ∞) and ⋂ℰ = (0, 1).




19. Prove the second part of Theorem 3.4.13. 
20. Prove Theorem 3.4.17 indirectly. 


21. Is Theorem 3.4.17 still true if the family of sets ℱ has at most one element?
Explain.
22. Let {A_i : i ∈ I} be a family of sets and prove the following.
(a) If B ⊆ A_i for some i ∈ I, then B ⊆ ⋃_{i∈I} A_i.
(b) If A_i ⊆ B for all i ∈ I, then ⋂_{i∈I} A_i ⊆ B.
(c) If B ⊆ ⋂_{i∈I} A_i, then B ⊆ A_i for all i ∈ I.


23. Prove Theorem 3.4.14.


24. Show ⋂∅ = U where U is a universe.
25. Let ℱ be a family of sets such that ∅ ∈ ℱ. Prove ⋂ℱ = ∅.


26. Find families of sets ℰ and ℱ so that ⋃ℰ = ⋃ℱ but ℰ ≠ ℱ. Can this be
repeated by replacing union with intersection?


27. Let ℱ be a family of sets, and let A ∈ ℱ. Prove ⋂ℱ ⊆ A ⊆ ⋃ℱ.


28. Let ℰ and ℱ be families of sets. Show the following.
(a) ⋃{ℱ} = ℱ
(b) ⋂{ℱ} = ℱ
(c) If ℰ ⊆ ℱ, then ⋃ℰ ⊆ ⋃ℱ.
(d) If ℰ ⊆ ℱ, then ⋂ℱ ⊆ ⋂ℰ.
(e) ⋃(ℰ ∪ ℱ) = ⋃ℰ ∪ ⋃ℱ
(f) ⋂(ℰ ∪ ℱ) = ⋂ℰ ∩ ⋂ℱ

29. Find families of sets ℰ and ℱ that make the following false.
(a) ⋂(ℰ ∩ ℱ) = ⋂ℰ ∩ ⋂ℱ
(b) ⋃(ℰ ∩ ℱ) = ⋃ℰ ∩ ⋃ℱ


30. Let ℱ be a family of sets. For each of the given equalities, prove true or show that
it is false by finding a counterexample.
(a) ⋃P(ℱ) = ℱ
(b) ⋂P(ℱ) = ℱ
(c) P(⋃ℱ) = ℱ
(d) P(⋂ℱ) = ℱ


31. Prove these alternate forms of De Morgan’s laws.
(a) A \ ⋃_{i∈I} B_i = ⋂_{i∈I} (A \ B_i)
(b) A \ ⋂_{i∈I} B_i = ⋃_{i∈I} (A \ B_i)
32. Assume that the universe U contains only sets such that for all A ∈ U, A ⊆ U and
P(A) ∈ U. Prove the following equalities.
(a) ⋃U = U
(b) ⋂U = ∅


33. Let I_n = [n, n + 1] and J_m = [m, m + 1], n, m ∈ Z. Define
ℱ = {I_n × J_m : n, m ∈ Z}.


Show ⋃ℱ = R² and ⋂ℱ = ∅. Is ℱ pairwise disjoint?


CHAPTER 4 


RELATIONS AND FUNCTIONS 


4.1 RELATIONS 


A relation is an association between objects. A book on a table is an example of the 
relation of one object being on another. It is especially common to speak of relations 
among people. For example, one person could be the niece of another. In mathematics, 
there are many relations such as equals and less-than that describe associations between 
numbers. To formalize this idea, we make the next definition. 


■ DEFINITION 4.1.1


A set R is an (n-ary) relation if there exist sets A_0, A_1, ..., A_{n-1} such that
R ⊆ A_0 × A_1 × ··· × A_{n-1}.


In particular, R is a unary relation if n = 1 and a binary relation if n = 2. If
R ⊆ A × A for some set A, then R is a relation on A and we write (A, R).


The relation on can be represented as a subset of the Cartesian product of the set of all 
books and the set of all tables. We could then write (dictionary, desk) to mean that the 




A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’ Leary. 
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc. 


dictionary is on the desk. Similarly, the set {(2,4), (7,3), (0,0)} is a relation because
it is a subset of Z × Z. The ordered pair (2,4) means that 2 is related to 4. Likewise,
R × Q is a relation where every real number is related to every rational number, and
according to Definition 4.1.1, the empty set is also a relation because ∅ = ∅ × ∅.


■ EXAMPLE 4.1.2


For any set A, define
I_A = {(a, a) : a ∈ A}.


Call this set the identity on A. In particular, the identity on R is
I_R = {(x, x) : x ∈ R},


and the identity on Z is
I_Z = {(x, x) : x ∈ Z}.


Notice that ∅ is the identity on ∅.


■ EXAMPLE 4.1.3


The less-than relation on Z is defined as
L = {(a, b) : a, b ∈ Z ∧ a < b}.


Another approach is to use membership in the set of positive integers as our
condition. That is,
L = {(a, b) : a, b ∈ Z ∧ b − a ∈ Z⁺}.


Hence, (4,7) ∈ L because 7 − 4 ∈ Z⁺. See Exercise 1 for another definition of
L.


When a relation R ⊆ A × B is defined, all elements in A or B might not be used.
For this reason, it is important to identify the sets that comprise all possible values for
the two coordinates of the relation.


■ DEFINITION 4.1.4
Let R ⊆ A × B. The domain of R is the set
dom(R) = {x ∈ A : ∃y(y ∈ B ∧ (x, y) ∈ R)},
and the range of R is the set


ran(R) = {y ∈ B : ∃x(x ∈ A ∧ (x, y) ∈ R)}.


■ EXAMPLE 4.1.5


If R = {(1,3), (2,4), (2,5)}, then dom(R) = {1,2} and ran(R) = {3,4,5}. We
represent this in Figure 4.1 where A and B are sets so that {1,2} ⊆ A and
{3,4,5} ⊆ B. The ordered pair (1,3) is denoted by the arrow pointing from
1 to 3. Also, both dom(R) and ran(R) are shaded.
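
For a finite relation stored as a set of ordered pairs, the domain and range are
simply the sets of first and second coordinates. A minimal sketch, assuming Python
and the function names dom and ran borrowed from the notation of the definition:

```python
def dom(relation):
    """First coordinates of the pairs in the relation."""
    return {x for (x, _) in relation}

def ran(relation):
    """Second coordinates of the pairs in the relation."""
    return {y for (_, y) in relation}

R = {(1, 3), (2, 4), (2, 5)}
domain_of_R = dom(R)
range_of_R = ran(R)
```

The sets A and B do not appear in the code because, for a relation given as a
roster, every first coordinate already lies in A and every second coordinate in B.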


Figure 4.1 R = {(1,3), (2,4), (2,5)}.


■ EXAMPLE 4.1.6


Let S = {(x, y) : |x| = y ∧ x, y ∈ R}. Notice that both (2,2) and (−2,2) are
elements of S. Furthermore, dom(S) = R and ran(S) = [0, ∞).


The domain and range of a relation can be the same set as in the next two examples.


■ EXAMPLE 4.1.7
If R = {(0,1), (0,2), (1,0), (2,0)}, then R is a relation on the set {0,1,2} with
dom(R) = ran(R) = {0,1,2}.

■ EXAMPLE 4.1.8
For the relation


S = {(x, y) ∈ R² : √(x² + y²) ≥ 1},


both the domain and range equal R. First, note that it is clear by the definition
of S that both dom(S) and ran(S) are subsets of R. For the other inclusion, take
x, y ∈ R. Since (x, 1) ∈ S, x ∈ dom(S), and since (1, y) ∈ S, y ∈ ran(S).


Composition 


Given the relations R and S, let us define a new relation. Suppose that (a, b) ∈ S and
(b, c) ∈ R. Therefore, a is related to c through b. The new relation will contain the
ordered pair (a, c) to represent this relationship.


■ DEFINITION 4.1.9


Let R ⊆ A × B and S ⊆ B × C. The composition of S and R is the subset of
A × C defined as


S ∘ R = {(x, z) : ∃y(y ∈ B ∧ (x, y) ∈ R ∧ (y, z) ∈ S)}.


As illustrated in Figure 4.2, the reason that (a, c) ∈ R ∘ S is because (a, b) ∈ S and
(b, c) ∈ R. That is, a is related to c via R ∘ S because a is related to b via S and then b
is related to c via R. The composition can be viewed as the direct path from a to c that
does not require the intermediary b.


Figure 4.2 A composition of relations.
■ EXAMPLE 4.1.10
To clarify the definition, let
R = {(2,4), (1,3), (2,5)}
and
S = {(0,1), (1,0), (0,2), (2,0)}.
We have that R ∘ S = {(0,3), (0,4), (0,5)}. Notice that (0,3) ∈ R ∘ S because
(0,1) ∈ S and (1,3) ∈ R. However, S ∘ R is empty since ran(R) and dom(S)
are disjoint.
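
The order of the two relations in a composition is easy to get backwards, so a
small computation can be reassuring. The sketch below assumes Python; the function
name compose and its argument order (outer relation first, as in the notation for
composition) are choices made here, not part of the text.

```python
def compose(s, r):
    """Return the composition of s and r: pairs (x, z) such that
    (x, y) is in r and (y, z) is in s for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

R = {(2, 4), (1, 3), (2, 5)}
S = {(0, 1), (1, 0), (0, 2), (2, 0)}

R_after_S = compose(R, S)    # apply S first, then R
S_after_R = compose(S, R)    # apply R first, then S
```

The first result contains (0,3), (0,4), and (0,5), and the second is empty, as
stated in the example.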
■ EXAMPLE 4.1.11


Define
R = {(x, y) ∈ R² : x² + y² = 1}


and
S = {(x, y) ∈ R² : y = x + 1}.


Notice that R is the unit circle and S is the line with slope of 1 and y-intercept
of (0, 1). Let us find R ∘ S.


R ∘ S = {(x, z) ∈ R² : ∃y(y ∈ R ∧ (x, y) ∈ S ∧ (y, z) ∈ R)}
= {(x, z) ∈ R² : ∃y(y ∈ R ∧ y = x + 1 ∧ y² + z² = 1)}
= {(x, z) ∈ R² : (x + 1)² + z² = 1}.


Therefore, R ∘ S is the circle with center (−1, 0) and radius 1.


Example 4.1.10 shows that it is possible that S ∘ R ≠ R ∘ S. However, composition
is associative, so we can change the grouping in a composition of three relations.


■ THEOREM 4.1.12
If R ⊆ A × B, S ⊆ B × C, and T ⊆ C × D, then T ∘ (S ∘ R) = (T ∘ S) ∘ R.


PROOF
Assume that R ⊆ A × B, S ⊆ B × C, and T ⊆ C × D. Then,


(a, d) ∈ T ∘ (S ∘ R)
⇔ ∃c(c ∈ C ∧ (a, c) ∈ S ∘ R ∧ (c, d) ∈ T)
⇔ ∃c(c ∈ C ∧ ∃b(b ∈ B ∧ (a, b) ∈ R ∧ (b, c) ∈ S) ∧ (c, d) ∈ T)
⇔ ∃c∃b(c ∈ C ∧ b ∈ B ∧ (a, b) ∈ R ∧ (b, c) ∈ S ∧ (c, d) ∈ T)
⇔ ∃b∃c(b ∈ B ∧ (a, b) ∈ R ∧ c ∈ C ∧ (b, c) ∈ S ∧ (c, d) ∈ T)
⇔ ∃b(b ∈ B ∧ (a, b) ∈ R ∧ ∃c[c ∈ C ∧ (b, c) ∈ S ∧ (c, d) ∈ T])
⇔ ∃b(b ∈ B ∧ (a, b) ∈ R ∧ (b, d) ∈ T ∘ S)
⇔ (a, d) ∈ (T ∘ S) ∘ R.

Inverses 


Let R ⊆ A × B. We know that R ∘ I_A = R and I_B ∘ R = R [Exercise 10(a)]. If we
want I_A = I_B, we need A = B so that R is a relation on A. Then,


R ∘ I_A = I_A ∘ R = R.


For example, if we again define R = {(2,4), (1,3), (2,5)} and view it as a relation on
Z, then R composed on either side by I_Z yields R. To illustrate, consider the ordered
pair (1,3). It is an element of R ∘ I_Z because


(1,1) ∈ I_Z ∧ (1,3) ∈ R,
and it is also an element of I_Z ∘ R because
(1,3) ∈ R ∧ (3,3) ∈ I_Z.


Notice that not every identity relation will have this property. Using the same definition
of R as above,

R ∘ I_{{1,2}} = R,
but

I_{{1,2}} ∘ R = ∅.


Now let us change the problem. Given a relation R on A, can we find a relation S
on A such that R ∘ S = S ∘ R = I_A? The next definition is used to try to answer this
question.


■ DEFINITION 4.1.13
Let R be a binary relation. The inverse of R is the set
R⁻¹ = {(y, x) : (x, y) ∈ R}.


For a relation S, we say that R and S are inverse relations if R⁻¹ = S.


■ EXAMPLE 4.1.14

Let L be the less-than relation on R (Example 4.1.3). Then,
L⁻¹ = {(y, x) ∈ R² : (x, y) ∈ L}
= {(y, x) ∈ R² : x < y}
= {(y, x) ∈ R² : y > x}.


This shows that less-than and greater-than are inverse relations.


We now check whether R ∘ R⁻¹ = R⁻¹ ∘ R = I_A for any relation R on A. Consider
R = {(2,1), (4,3)}, which is a relation on {1,2,3,4}. Then,


R⁻¹ = {(1,2), (3,4)},
and we see that composing does not yield the identity on {1,2,3,4} because
R ∘ R⁻¹ = {(1,1), (3,3)} = I_{{1,3}}


and
R⁻¹ ∘ R = {(2,2), (4,4)} = I_{{2,4}}.


The situation is worse when we define S = {(2,1), (2,3), (4,3)}. In this case, we have
that
S⁻¹ = {(1,2), (3,2), (3,4)}


but
S ∘ S⁻¹ = {(1,1), (1,3), (3,1), (3,3)}


and
S⁻¹ ∘ S = {(2,2), (2,4), (4,2), (4,4)}.


Neither of these compositions leads to an identity, but at least we have that
{(1,1), (3,3)} ⊆ S ∘ S⁻¹


and
{(2,2), (4,4)} ⊆ S⁻¹ ∘ S.


This can be generalized.


■ THEOREM 4.1.15


I_{ran(R)} ⊆ R ∘ R⁻¹ and I_{dom(R)} ⊆ R⁻¹ ∘ R for any binary relation R.


PROOF
The first inclusion is proved in Exercise 12. To see the second inclusion, let
x ∈ dom(R). By definition, there exists y ∈ ran(R) so that (x, y) ∈ R. Hence,
(y, x) ∈ R⁻¹, which implies that (x, x) ∈ R⁻¹ ∘ R. ■
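
The two inclusions can be observed on the relations used in the discussion before
the theorem. The following sketch assumes Python, and the helper names compose,
inverse, identity, dom, and ran are chosen here for illustration.

```python
def compose(s, r):
    """Pairs (x, z) such that (x, y) is in r and (y, z) is in s for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def inverse(r):
    return {(y, x) for (x, y) in r}

def identity(a):
    return {(x, x) for x in a}

def dom(r):
    return {x for (x, _) in r}

def ran(r):
    return {y for (_, y) in r}

R = {(2, 1), (4, 3)}
S = {(2, 1), (2, 3), (4, 3)}

R_Rinv = compose(R, inverse(R))
Rinv_R = compose(inverse(R), R)
S_Sinv = compose(S, inverse(S))
Sinv_S = compose(inverse(S), S)
```

For R the two compositions are exactly the identities on ran(R) and dom(R), while
for S they are strictly larger, so only the inclusions of the theorem survive in
general.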


Exercises 


1. Let L ⊆ Z × Z be the less-than relation as defined in Example 4.1.3. Prove that


L = {(a, b) ∈ Z × Z : a − b ∈ Z⁻}.


2. Find the domain and range of the given relations.
(a) {(0,1), (2,3), (4,5), (6,7)}
(b) {((a,b), 1), ((a,c), 2), ((a,d), 3)}
(c) R × Z
(d) ∅ × ∅
(e) Q × ∅
(f) {(x, y) : x, y ∈ [0,1] ∧ x < y}
(g) {(x, y) ∈ R² : y = 3}
(h) {(x, y) ∈ R² : y = |x|}
(i) {(x, y) ∈ R² : x² + y² = 4}
(j) {(x, y) ∈ R² : y < √x ∧ x ≥ 0}
(k) {(f, g) : ∃a ∈ R(f(x) = eˣ ∧ g(x) = ax)}
(l) {((a, b), a + b) : a, b ∈ Z}


3. Write R o S as a roster. 
(a) R= {(1,0), (2,3), (4,6)}, S = {C, 2), 2, 3), 3,4} 
(b) R= {,3), (2,5), 3, D}, S = {, 3), 3, D, G6, 2)} 
(c) R= {(1, 2), 3,4), 6, 6)}, S = {C, 2), 3,4), , 6)} 
(d) R= {d,2),G,4), 5, 6)}, S = {@, 1), 3,5),6,7)} 


4. Write R o S using abstraction. 
(a) R={(@x,y)ER?:x?+y=1} 


S=R 
(b) R={(x,y)ER?:x27+y=1} 
S=ZxZ 


(cc) R={(@%,y)ER?: x+y =1} 
S = {(x,y) eR? :(x-2)? +y =1} 
(dd) R={(xy,yeER:x+y=1} 
S = {(x,y)€R*: y=2x-1} 
5. Write the inverse of each relation. Use the abstraction method where appropriate. 
(a) @ 
(b) Iz 
(c) {(1,9), (2, 3), (4, 6)} 




(d) {(1,0), C1, 1), (2, 1)} 
(e) ZXR 
(f) {(x,sinx):x € R} 
(g) (x, y)ER?:x+y=1) 
(h) (x, ER? 2x? +y =1) 
6. Let R ⊆ A × B and S ⊆ B × C. Show the following.
(a) R⁻¹ ∘ S⁻¹ ⊆ C × A.
(b) dom(R) = ran(R⁻¹).
(c) ran(R) = dom(R⁻¹).
7. Prove that if R is a binary relation, (R⁻¹)⁻¹ = R.
8. Prove that (S ∘ R)⁻¹ = R⁻¹ ∘ S⁻¹ if R ⊆ A × B and S ⊆ B × C.


9. Let R, S ⊆ A × B. Prove the following.
(a) If R ⊆ S, then R⁻¹ ⊆ S⁻¹.
(b) (R ∪ S)⁻¹ = R⁻¹ ∪ S⁻¹.
(c) (R ∩ S)⁻¹ = R⁻¹ ∩ S⁻¹.
10. Let R ⊆ A × B.
(a) Prove R ∘ I_A = R and I_B ∘ R = R.
(b) Show that if there exists a set C such that A and B are subsets of C, then
R ∘ I_C = I_C ∘ R = R.


11. Let R ⊆ A × B and S ⊆ B × C. Show that S ∘ R = ∅ if and only if dom(S) and
ran(R) are disjoint.


12. For any relation R, prove I_{ran(R)} ⊆ R ∘ R⁻¹.


13. Let R ⊆ A × B. Prove.
(a) ⋃_{b∈B} {x ∈ A : (x, b) ∈ R} = dom(R).
(b) ⋃_{a∈A} {y ∈ B : (a, y) ∈ R} = ran(R).


4.2 EQUIVALENCE RELATIONS 


In practice we usually do not write relations as sets of ordered pairs. We instead write 
propositions like 4 = 4 or 3 < 9. To copy this, we will introduce an alternate notation. 
■ DEFINITION 4.2.1
Let R be a relation on A. For all a, b ∈ A,
a R b if and only if (a, b) ∈ R,
and
a R̸ b if and only if (a, b) ∉ R.


For example, the less-than relation L (Example 4.1.3) is usually denoted by <, and we
write 2 < 3 instead of (2,3) ∈ L or (2,3) ∈ <.


■ EXAMPLE 4.2.2
Define the relation R on Z by


R = {(a, b) ∈ Z × Z : ∃c(c ∈ Z ∧ b = ac ∧ a ≠ 0)}.


Therefore, for all a, b ∈ Z, a R b if and only if a divides b. For example, 4 R 8 but
8 R̸ 4.
Relations can have different properties depending on their definitions. Here are three
important examples using relations on A = {1,2,3}.


• {(1,1), (2,2), (3,3)} has the property that every element of A is related to itself.


• {(1,2), (2,1), (2,3), (3,2)} has the property that if a is related to b, then b is
related to a.


• {(1,2), (2,3), (1,3)} has the property that if a is related to b and b is related to c,
then a is related to c.


These examples lead to the following definitions. 


■ DEFINITION 4.2.3


Let R be a relation on A.


• R is reflexive if a R a for all a ∈ A.
• R is symmetric when for all a, b ∈ A, if a R b, then b R a.
• R is transitive means that for all a, b, c ∈ A, if a R b and b R c, then a R c.


Notice that the relation in Example 4.2.2 is not reflexive because 0 does not divide 0 
and is not symmetric because 4 divides 8 but 8 does not divide 4, but it is transitive 
because if a divides b and b divides c, then a divides c. 
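
For a finite relation, each of the three properties can be tested directly from
its definition. The sketch below assumes Python; the function names are chosen
here, and the divisibility relation is restricted to the window −8, ..., 8 purely
so that it is finite, which is an assumption of the illustration rather than
something done in the text.

```python
def is_reflexive(r, a):
    return all((x, x) in r for x in a)

def is_symmetric(r):
    return all((y, x) in r for (x, y) in r)

def is_transitive(r):
    return all((x, z) in r for (x, y1) in r for (y2, z) in r if y1 == y2)

A = {1, 2, 3}
R1 = {(1, 1), (2, 2), (3, 3)}
R2 = {(1, 2), (2, 1), (2, 3), (3, 2)}
R3 = {(1, 2), (2, 3), (1, 3)}

# Divisibility with a nonzero divisor, cut down to a finite window of integers.
window = range(-8, 9)
divides = {(a, b) for a in window for b in window if a != 0 and b % a == 0}

summary = {
    "R1": (is_reflexive(R1, A), is_symmetric(R1), is_transitive(R1)),
    "R2": (is_reflexive(R2, A), is_symmetric(R2), is_transitive(R2)),
    "R3": (is_reflexive(R3, A), is_symmetric(R3), is_transitive(R3)),
    "divides": (
        is_reflexive(divides, window),
        is_symmetric(divides),
        is_transitive(divides),
    ),
}
```

Each entry of summary lists the answers in the order reflexive, symmetric,
transitive. The divisibility relation fails the first two tests and passes the
third, in agreement with the remarks above.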

When a relation is reflexive, symmetric, and transitive, it behaves very much like an 
identity relation (Example 4.1.2). Such relations play an important role in mathematics, 
so we name them. 


@ DEFINITION 4.2.4 


A relation R on A is an equivalence relation if R is reflexive, symmetric, and 
transitive. 


Observe that the relation in Example 4.2.2 is not an equivalence relation. However, any 
identity relation is an equivalence relation. We see this assumption at work in the next 
example. 


■ EXAMPLE 4.2.5
Let R be a relation on Z × (Z \ {0}) so that for all a, c ∈ Z and b, d ∈ Z \ {0},
(a, b) R (c, d) if and only if ad = bc.


To see that this is an equivalence relation, let (a, b), (c, d), and (e, f) be elements
of Z × (Z \ {0}).


• (a, b) R (a, b) since ab = ab.


• Assume (a, b) R (c, d). Then, ad = bc. This implies that cb = da, so
(c, d) R (a, b).


• Let (a, b) R (c, d) and (c, d) R (e, f). This gives ad = bc and cf = de.
Therefore, (a, b) R (e, f) because afd = bcf = bde and d ≠ 0, which gives af = be.

■ EXAMPLE 4.2.6


Take m ∈ Z⁺ and let a, b, and c be integers. Define a to be congruent to b
modulo m and write


a ≡ b (mod m) if and only if m | a − b.
That is,
a ≡ b (mod m) if and only if a = b + mk for some k ∈ Z.


For example, we have that 7 ≡ 1 (mod 3), 1 ≡ 13 (mod 3), and 27 ≡ 0 (mod 3),
but 2 ≢ 9 (mod 3) and 25 ≢ 0 (mod 3). Congruence modulo m defines the
relation

R_m = {(a, b) : a ≡ b (mod m)}.


Observe that
R_m = {(a, b) : m | a − b} = {(a, b) : ∃k(k ∈ Z ∧ a = b + mk)}.
We prove that R_m is an equivalence relation.
• (a, a) ∈ R_m because a ≡ a (mod m).


• Assume (a, b) ∈ R_m. This implies that m | a − b. By Exercise 2.4.18,
m | b − a. Hence, (b, a) ∈ R_m.


• Let (a, b), (b, c) ∈ R_m. Then a = b + mk and b = c + ml for some k, l ∈ Z.
Substitution yields


a = c + ml + mk = c + m(l + k).


Since the sum of two integers is an integer, (a, c) ∈ R_m.


Equivalence Classes
Let R be a relation on {1,2,3,4} such that
R = {(1,2), (1,3), (2,4)}. (4.1)


Observe that 1 is related to 2 and 3, 2 is related to 4, and 3 and 4 are not related to any
number. Combining the elements that are related to a particular element results in a set
named by the next definition.


■ DEFINITION 4.2.7
Let R be a relation on A with a ∈ A. The class of a with respect to R is the set
[a]_R = {x ∈ A : a R x}.


If R is an equivalence relation, [a]_R is called an equivalence class. We often
denote [a]_R by [a] if the relation is clear from context.


Using R as defined in (4.1),
[1]_R = {2,3}, [2]_R = {4}, and [3]_R = [4]_R = ∅.


If R had been an equivalence relation on a set A, then [a]_R would be nonempty for all
a ∈ A because a would be an element of [a]_R (Exercise 17).
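
For a finite relation the class of an element can be read off directly from the
definition. A minimal sketch, assuming Python and a hypothetical helper named
class_of:

```python
def class_of(a, relation, underlying_set):
    """Return the class of a: the elements x of the set with (a, x) in the relation."""
    return {x for x in underlying_set if (a, x) in relation}

A = {1, 2, 3, 4}
R = {(1, 2), (1, 3), (2, 4)}          # the relation labeled (4.1)

classes = {a: class_of(a, R, A) for a in sorted(A)}
```

The classes of 3 and 4 are empty, which can happen here only because R is not
reflexive and therefore not an equivalence relation.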


■ EXAMPLE 4.2.8
Let R be the equivalence relation from Example 4.2.5. We prove that
[(1,3)] = {(n, 3n) : n ∈ Z \ {0}}.


To see this, take (a, b) ∈ [(1,3)]. This means that (1,3) R (a, b), so b = 3a and
a ≠ 0 because b ≠ 0. Hence, (a, b) = (a, 3a). Conversely, let n ≠ 0. Then,
(1,3) R (n, 3n) because 1 · 3n = 3 · n. Thus, (n, 3n) ∈ [(1,3)].

■ EXAMPLE 4.2.9


Using the notation of Example 4.2.6, let R_m be the relation defined by congruence
modulo m. For all n ∈ Z, define


[n]_m = [n]_{R_m}.


Therefore, when m = 5, the equivalence classes are:


[0]_5 = {..., −10, −5, 0, 5, 10, ...},
[1]_5 = {..., −9, −4, 1, 6, 11, ...},
[2]_5 = {..., −8, −3, 2, 7, 12, ...},
[3]_5 = {..., −7, −2, 3, 8, 13, ...},
[4]_5 = {..., −6, −1, 4, 9, 14, ...}.


In addition,


[a]_5 = [b]_5 if and only if a ≡ b (mod 5).
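
Congruence is straightforward to test by machine, which makes it possible to list
the part of each class that falls in a finite window. The sketch below assumes
Python; the function name congruent and the window −10, ..., 14 are choices made
here for illustration.

```python
def congruent(a, b, m):
    """True when m divides a - b, that is, when a is congruent to b modulo m."""
    return (a - b) % m == 0

window = range(-10, 15)
classes_mod_5 = {
    r: [n for n in window if congruent(n, r, 5)] for r in range(5)
}

examples_mod_3 = [
    congruent(7, 1, 3),
    congruent(1, 13, 3),
    congruent(27, 0, 3),
    congruent(2, 9, 3),
    congruent(25, 0, 3),
]
```

The five lists reproduce the displayed portions of the classes modulo 5, and the
truth values in examples_mod_3 match the congruences modulo 3 stated in
Example 4.2.6.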


The collection of all equivalence classes of a relation is a set named by the next 
definition. 


■ DEFINITION 4.2.10
Let R be an equivalence relation on A. The quotient set of A modulo R is


A/R = {[a]_R : a ∈ A}.


Observe by Exercise 3 that it is always the case that


A = ⋃ A/R. (4.2)


■ EXAMPLE 4.2.11


Let m ∈ Z⁺. The quotient set Z/R_m is denoted by Z_m. That is,


Z_m = {[0]_m, [1]_m, ..., [m − 1]_m}.


■ EXAMPLE 4.2.12
Define the relation R on R² by
(a, b) R (c, d) if and only if b − a = d − c.
R is an equivalence relation by Exercise 2. We note that for any (a, b) ∈ R²,


[(a, b)] = {(x, y) : (x, y) R (a, b)}
= {(x, y) : y − x = b − a}
= {(x, y) : y = x + (b − a)}.


Therefore, the equivalence class of (a, b) is the line with a slope of 1 and a
y-intercept equal to (0, b − a). The equivalence classes of (0, 1.5) and (0, −1) are
illustrated in the graph in Figure 4.3. The quotient set R²/R is the collection of
all such lines. Notice that


R² = ⋃ R²/R.


Partitions


In Example 4.2.12, we saw that R² is the union of all the lines with slope equal to 1, and
since the lines are parallel, they form a pairwise disjoint set. These properties can be
observed in the other equivalence relations that we have seen. Each set is equal to the
union of the equivalence classes, and the quotient set is pairwise disjoint. Generalizing
these two properties leads to the next definition.


Figure 4.3 Two equivalence classes in R² when (a, b) R (c, d) if and only if b − a = d − c.


■ DEFINITION 4.2.13
Let A be a nonempty set. The family ℱ is a partition of A if and only if


• ℱ ⊆ P(A),
• ⋃ℱ = A,
• ℱ is pairwise disjoint.


To illustrate the definition, let A = {1,2,3,4,5,6,7} and define the elements of the
partition to be A_0 = {1,2,5}, A_1 = {3}, and A_2 = {4,6,7}. The family


ℱ = {A_0, A_1, A_2}


is a subset of P(A), A = A_0 ∪ A_1 ∪ A_2, and ℱ is pairwise disjoint. Therefore, ℱ is a
partition of A. This is illustrated in Figure 4.4.


Figure 4.4 A partition of the set A = {1,2,3,4,5,6,7}.


■ EXAMPLE 4.2.14


For each real number r ≥ 0, define C_r to be the circle with radius r centered at
the origin. Namely,


C_r = {(x, y) : √(x² + y²) = r}.


Let ℰ = {C_r : r ∈ [0, ∞)}. We claim that ℰ is a partition of R².
• ℰ ⊆ P(R²) because C_r ⊆ R² for all r ≥ 0.


• To prove that R² = ⋃_{r∈[0,∞)} C_r, it suffices to show that R² ⊆ ⋃_{r∈[0,∞)} C_r,
but this follows because if (a, b) ∈ R², then


(a, b) ∈ C_{√(a² + b²)}.


• To see that ℰ is pairwise disjoint, let r, s ≥ 0 and assume that (a, b) is an
element of C_r ∩ C_s. Then,


r = √(a² + b²) = s,


which implies that C_r = C_s.


The set Z_5 is a family of subsets of Z, has the property that ⋃Z_5 = Z, and is
pairwise disjoint. Hence, Z_5 is a partition for Z. We generalize this result to the next
theorem. It uses an arbitrary equivalence relation on a given set to define a partition
for that set. In this case, we say that the equivalence relation induces the partition.


■ THEOREM 4.2.15


If R is an equivalence relation on A, then A/R is a partition of A.


PROOF
Take a set A with an equivalence relation R.


• Since an equivalence class is a subset of A, we have A/R ⊆ P(A).
• ⋃ A/R = A is (4.2).


• Let [a], [b] ∈ A/R and assume that there exists y ∈ [a] ∩ [b]. In other
words, a R y and b R y. Now take x ∈ [a]. This means that a R x. Since
x R a and y R b by symmetry, we have that x R y, and then x R b by
transitivity, so b R x by symmetry. Thus, x ∈ [b], which shows [a] ⊆ [b].
Similarly, [b] ⊆ [a], so [a] = [b]. ■


The collection of equivalence classes of an equivalence relation forms a partition of
a set. Conversely, if we have a partition of a set, the partition gives rise to an
equivalence relation on the set. To see this, take any set A and a partition ℱ of A.
For all a, b ∈ A, define


a R b if and only if there exists C ∈ ℱ such that a, b ∈ C.


To show that R is an equivalence relation, take a, b, and c in A.


• Since a ∈ A and A = ⋃ℱ, there exists C ∈ ℱ such that a ∈ C. Therefore,
a R a.


• Assume a R b. This means that a, b ∈ C for some C ∈ ℱ. This, of course, is
the same as b, a ∈ C. Hence, b R a.


• Suppose a R b and b R c. Then, there are sets C and D in ℱ so that a, b ∈ C and
b, c ∈ D. This means that C ∩ D ≠ ∅. Since ℱ is pairwise disjoint, C = D. So,
a and c are elements of C, and we have a R c.


This equivalence relation is said to be induced from the partition. 


EXAMPLE 4.2.16


The sets 
{...,-10,-5,0,5,10,... }, 
{...,-9,-4, 1,6, 11,... }, 
{...,-8, -3, 2,7, 12,... }, 
{...,-7, -2,3, 8, 13,... }, 
{...,-6,-1,4,9, 14,... }, 


form a collection that is a partition of Z. The equivalence relation that is induced 
from this partition is congruence modulo 5 (Example 4.2.9). 
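
The passage between a partition and its induced equivalence relation can be checked mechanically on a finite window of Z. The following Python sketch is only an illustration: the function names and the window −10, ..., 14 are assumptions made here, and a finite check is of course not a proof.

```python
def relation_from_partition(partition):
    """Pairs (a, b) such that a and b lie in a common cell of the partition."""
    return {(a, b) for cell in partition for a in cell for b in cell}


def classes_from_relation(universe, relation):
    """The family of classes [a] = {x : a R x} for a in the universe."""
    return {frozenset(x for x in universe if (a, x) in relation) for a in universe}


# The five cells listed in the example, cut down to the window -10..14.
universe = range(-10, 15)
cells = {frozenset(n for n in universe if n % 5 == r) for r in range(5)}
R = relation_from_partition(cells)

# The induced relation agrees with congruence modulo 5 on the window ...
same_as_mod5 = all(
    ((a, b) in R) == ((a - b) % 5 == 0) for a in universe for b in universe
)
# ... and its equivalence classes recover the cells we started with.
round_trip = classes_from_relation(universe, R) == cells
```

Both `same_as_mod5` and `round_trip` evaluate to `True`, in line with Theorem 4.2.15 and the construction of the induced relation.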


Exercises 


1. For all a, b ∈ R \ {0}, let a R b if and only if ab > 0.
(a) Show that R is an equivalence relation on R \ {0}.
(b) Find [1] and [−3].


2. Define the relation S on R² by (a, b) S (c, d) if and only if b − a = d − c. Prove that
S is an equivalence relation.


3. Let S be an equivalence relation on A. Prove that A = ⋃_{a ∈ A} [a].
4. Prove that if C is an equivalence class for some equivalence relation R and a ∈ C,
then C = [a].


5. For all a, b ∈ Z, let a R b if and only if |a| = |b|.
(a) Prove R is an equivalence relation on Z.
(b) Sketch the partition of Z induced by this equivalence relation.


6. For all (a, b), (c, d) ∈ Z × Z, define (a, b) S (c, d) if and only if ab = cd.
(a) Show that S is an equivalence relation on Z × Z.
(b) What is the equivalence class of (1, 2)?
(c) Sketch the partition of Z × Z induced by this equivalence relation.


7. Let A be a set and a ∈ A. Show that the given relations are not equivalence relations
on P(A).




(a) For all C, D ⊆ A, define C R D if and only if C ∩ D ≠ ∅.
(b) For all C, D ⊆ A, define C S D if and only if a ∈ C ∩ D.


8. Let z, z′ ∈ C and write z = a + bi and z′ = a′ + b′i. Define z = z′ to mean that
a = a′ and b = b′. Prove that this relation is an equivalence relation.


9. Find. 
(a) [3]5 
(b) [12]¢ 
(c) [2]; U[27]5 
) [4h95) 


10. Let r be the remainder obtained when n is divided by m. Prove [n]_m = [r]_m.


11. Let c, m ∈ Z and suppose that gcd(c, m) = 1. Prove.
(a) There exists b such that bc ≡ 1 (mod m). (Notice that b is the multiplicative
inverse of c modulo m.)
(b) If ca ≡ cb (mod m), then a ≡ b (mod m).
(c) Prove that the previous implication is false if gcd(c, m) ≠ 1.


12. Prove that {(n, n + 1] : n ∈ Z} is a partition of R.


13. Prove that the following are partitions of R².
(a) P = {{(a, b)} : a, b ∈ R}.
(b) Y = {{(r, y) : y ∈ R} : r ∈ R}.
(c) H = {R × (n, n + 1] : n ∈ Z}.


14. Is {[n, n + 1] × (n, n + 1) : n ∈ Z} a partition of R²? Explain.


15. Let R be a relation on A and show the following:
(a) R is reflexive if and only if R⁻¹ is reflexive.
(b) R is symmetric if and only if R = R⁻¹.
(c) R is symmetric if and only if (A × A) \ R is symmetric.


16. Let R and S be equivalence relations on A. Prove or show false.
(a) R ∪ S is an equivalence relation on A.
(b) R ∩ S is an equivalence relation on A.


17. Prove for all relations R on A.
(a) R is reflexive if and only if ∀a(a ∈ [a]).
(b) R is symmetric if and only if ∀a∀b(a ∈ [b] ↔ b ∈ [a]).
(c) R is transitive if and only if ∀a∀b∀c([b ∈ [a] ∧ c ∈ [b]] → c ∈ [a]).
(d) R is an equivalence relation if and only if ∀a∀b((a, b) ∈ R ↔ [a] = [b]).


18. Let R be a relation on A with the property that if a R b and b R c, then c R a. Prove
that if R is also reflexive, R is an equivalence relation.


19. Define the relation R on C by a + bi R c + di if and only if


√(a² + b²) = √(c² + d²).




(a) Prove R is an equivalence relation on C. 
(b) Graph [1 + i] in the complex plane. 
(c) Describe the partition that R induces on C. 


20. Let R and S be relations on A. The symmetric closure of R is S if S is symmetric,
R ⊆ S, and S ⊆ T for all symmetric relations T on A such that R ⊆ T. Prove the following.
(a) RU R7! is the symmetric closure of R. 
(b) A symmetric closure is unique. 


4.3. PARTIAL ORDERS 


While equivalence relations resemble equality, there are other common relations in 
mathematics that we can model. To study some of their attributes, we expand Defini- 
tion 4.2.3 with three more properties. 


DEFINITION 4.3.1


Let R be a relation on A.
• R is irreflexive if a R̸ a (that is, (a, a) ∉ R) for all a ∈ A.
• R is asymmetric when for all a, b ∈ A, if a R b, then b R̸ a.


• R is antisymmetric means that for all a, b ∈ A, if a R b and b R a, then
a = b.


Notice that a relation on a nonempty set cannot be both reflexive and irreflexive. However,
many relations have neither property. For example, consider the relation R =
{(1, 1)} on {1, 2}. Since (1, 1) ∈ R, the relation R is not irreflexive, and R is not
reflexive because (2, 2) ∉ R. Likewise, a nonempty relation cannot be both
symmetric and asymmetric.


EXAMPLE 4.3.2


The less-than relation on Z is irreflexive and asymmetric. It is also antisymmetric.
To see this, let a, b ∈ Z. Since "a < b and b < a" is false, the implication


if a < b and b < a, then a = b


is true. The ≤ relation is also antisymmetric. However, ≤ is neither irreflexive
nor asymmetric since 3 ≤ 3.


EXAMPLE 4.3.3


Let R = {(1, 2)} and S = {(1, 2), (2, 1)}. Both are relations on {1, 2}. The first
relation is asymmetric since 1 R 2 but 2 R̸ 1. It is also antisymmetric, but S is
not antisymmetric because 2 S 1 and 1 S 2, yet 1 ≠ 2. Both relations are irreflexive.
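
For relations on small finite sets, the properties of Definition 4.3.1 can be tested by brute force. The Python sketch below is a minimal illustration; the function names are chosen here and are not part of the text. A relation is represented as a set of ordered pairs.

```python
def is_reflexive(A, R):
    return all((a, a) in R for a in A)


def is_irreflexive(A, R):
    return all((a, a) not in R for a in A)


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_asymmetric(R):
    return all((b, a) not in R for (a, b) in R)


def is_antisymmetric(R):
    # Whenever both (a, b) and (b, a) are present, a must equal b.
    return all(a == b for (a, b) in R if (b, a) in R)


A = {1, 2}
R = {(1, 2)}
S = {(1, 2), (2, 1)}
neither = {(1, 1)}  # the relation that is neither reflexive nor irreflexive

summary = {
    "R asymmetric": is_asymmetric(R),
    "R antisymmetric": is_antisymmetric(R),
    "S antisymmetric": is_antisymmetric(S),
    "R irreflexive": is_irreflexive(A, R),
    "S irreflexive": is_irreflexive(A, S),
}
```

Here `summary` reports `True` for every entry except `"S antisymmetric"`, matching the discussion of R and S on {1, 2}.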




EXAMPLE 4.3.4


Let R be a relation on a set A. We prove that
R is antisymmetric if and only if R ∩ R⁻¹ ⊆ I_A.


• Assume that R is antisymmetric and take (a, b) ∈ R ∩ R⁻¹. This means
that (a, b) ∈ R and (a, b) ∈ R⁻¹. Therefore, (b, a) ∈ R, and since R is
antisymmetric, a = b. Hence, (a, b) ∈ I_A.


• Now suppose R ∩ R⁻¹ ⊆ I_A. Let (a, b), (b, a) ∈ R. We conclude that
(a, b) ∈ R⁻¹, which implies that (a, b) ∈ R ∩ R⁻¹. Then, (a, b) ∈ I_A.
Hence, a = b.


As an equivalence relation is a generalization of an identity relation, the following
relation is a generalization of ≤ on N. For this reason, instead of naming the relation
R, it is denoted by the symbol ≼.


DEFINITION 4.3.5


If a relation ≼ on a set A is reflexive, antisymmetric, and transitive, ≼ is a partial
order on A and the ordered pair (A, ≼) is called a partially ordered set (or
simply a poset). Furthermore, for all a, b ∈ A, the notation a ≺ b means a ≼ b
but a ≠ b.


For example, ≤ and = are partial orders on R, but < is not a partial order on R because
the relation < is not reflexive. Although = is a partial order on any set, in general an
equivalence relation is not a partial order (Example 4.2.6).


EXAMPLE 4.3.6 


Divisibility (Definition 2.4.2) is a partial order on Z⁺. To prove this, let a, b, and
c be positive integers.


• a | a since a = a · 1 and a ≠ 0.


• Suppose that a | b and b | a. This means that b = ak and a = bl, for some
k, l ∈ Z⁺. Hence, b = blk, so lk = 1. Since k and l are positive integers,
l = k = 1. That is, a = b.


• Assume that we have a | b and b | c. This means that b = al and c = bk
for some l, k ∈ Z⁺. By substitution, c = (al)k = a(lk). Hence, a | c.


EXAMPLE 4.3.7


Let A be a collection of symbols and let A* denote the set of all strings over A
(as on page 5). Use the symbol □ to denote the empty string, the string of length
zero. As with the empty set, the empty string is always an element of A*. For
example, if A = {a, b, c}, then abc, aaabbb, c, and □ are elements of A*. Now,
take σ, τ ∈ A*. The concatenation of σ and τ is denoted by σ⌢τ and is the
string consisting of the elements of σ followed by those of τ. For example, if
σ = 011 and τ = 1010, then σ⌢τ = 0111010. Finally, for all σ, τ ∈ A*, define


σ ≼ τ if and only if there exists ν ∈ A* such that τ = σ⌢ν.


It can be shown that ≼ is a partial order on A* (Exercise 8) with the structure
seen in Figure 4.5 for A = {0, 1}.


[Figure 4.5: A partial order defined on {0, 1}*. Levels from bottom to top: □; 0, 1; 00, 01, 10, 11; 000, 001, 010, 011, 100, 101, 110, 111.]
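
The order of this example is the familiar prefix order on strings, so it can be explored with ordinary string operations. The Python sketch below is an illustration under stated assumptions: the names `precedes` and `strings_up_to` are invented here, the empty string `""` plays the role of the empty-string symbol, and only strings of length at most 3 over {0, 1} are examined, so the checks illustrate Exercise 8 rather than prove it.

```python
from itertools import product


def precedes(sigma, tau):
    """sigma precedes tau when tau equals sigma followed by some string nu."""
    return tau.startswith(sigma)


def strings_up_to(alphabet, max_len):
    """All strings over the alphabet with at most max_len characters."""
    return ["".join(p) for n in range(max_len + 1) for p in product(alphabet, repeat=n)]


words = strings_up_to("01", 3)

reflexive = all(precedes(s, s) for s in words)
antisymmetric = all(
    s == t for s in words for t in words if precedes(s, t) and precedes(t, s)
)
transitive = all(
    precedes(s, u)
    for s in words for t in words for u in words
    if precedes(s, t) and precedes(t, u)
)
# 010 and 011 both extend 01, but neither is a prefix of the other.
incomparable_pair = not precedes("010", "011") and not precedes("011", "010")
```

All four flags are `True` on this finite fragment, which matches the shape of Figure 4.5.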


The partial order ≤ on R has the property that for all a, b ∈ R, either a < b, b < a,
or a = b. As we see in Figure 4.5, this is not the case for every partially ordered set.
We do, however, have the following slightly weaker property.


THEOREM 4.3.8 [Weak-Trichotomy Law]


If ≼ is a partial order on A, for all a, b ∈ A, at most one of the following is true:
a ≺ b, b ≺ a, or a = b.


PROOF
Let a, b ∈ A. We have three cases to consider.


• Suppose a ≺ b. This means that a ≼ b and a ≠ b, so a = b is false. If in
addition b ≺ a, then b ≼ a, and antisymmetry gives a = b, which is a contradiction.


• That b ≺ a precludes both a ≺ b and a = b is proved like the first case.


• If a = b, then by definition of ≺ it is impossible for a ≺ b or b ≺ a to be
true. ∎


Technically, subset is not a relation, but in a natural way, it can be considered as one.
Let ℱ be a family of sets. Define


S = {(A, B) : A ⊆ B ∧ A, B ∈ ℱ}.
Associate ⊆ with the relation S.


EXAMPLE 4.3.9


Let A be a nonempty set. We show that (P(A), ⊆) is a partially ordered set. Let
B, C, and D be subsets of A.




• Since B ⊆ B, the relation ⊆ is reflexive.


• Since B ⊆ C and C ⊆ B implies that B = C (Definition 3.3.7), ⊆ is
antisymmetric.


• Since B ⊆ C and C ⊆ D implies B ⊆ D (Theorem 3.3.6), we see that ⊆ is
transitive.


This example is in line with what we know about subsets. For instance, if A ⊂ B,
then we conclude that B ⊄ A and A ≠ B, which is what we expect from the
weak-trichotomy law (4.3.8).


Bounds 


Let ≼ be a partial order on A with elements m and m′ such that a ≼ m and a ≼ m′
for all a ∈ A. In particular, this implies that m ≼ m′ and m′ ≼ m, so since ≼ is
antisymmetric, we conclude that m = m′. Similarly, if m ≼ a and m′ ≼ a for all a ∈ A,
then m = m′. This argument justifies the use of the word the in the next definition.


DEFINITION 4.3.10
Let (A, ≼) be a poset and m ∈ A.
• m is the least element of A (with respect to ≼) if m ≼ a for all a ∈ A.


• m is the greatest element of A (with respect to ≼) if a ≼ m for all a ∈ A.


There is no guarantee that a partially ordered set will have a least or a greatest element.
In Example 4.3.9, the greatest element of P(A) with respect to ⊆ is A and the least
element is ∅. However, in Example 4.3.7 (Figure 4.5), the least element of A* is □
but there is no greatest element.


EXAMPLE 4.3.11


If A is a finite nonempty set of real numbers, A has a least and a greatest element with
respect to ≤. Under the partial order ≤, the set Z⁺ also has a least element but no
greatest element, and both {5n : n ∈ Z⁻} and Z⁻ have greatest elements but no
least elements. To show that B = {5n : n ∈ Z⁺} has a least element with respect
to ≤, use the fact that the least element of Z⁺ is 1. Therefore, the least element
of B is 5 because 5 ∈ B and 5(1) ≤ 5n for all n ∈ Z⁺.


Some sets will not have a least or greatest element with a given partial order but 
there will still be elements that are considered greater or lesser than every element of 
the set. 


DEFINITION 4.3.12


Let ≼ be a partial order on A and B ⊆ A.


• u ∈ A is an upper bound of B if b ≼ u for all b ∈ B. The element u is the
least upper bound of B if it is an upper bound and for all upper bounds u′
of B, u ≼ u′.


• l ∈ A is a lower bound of B if l ≼ b for all b ∈ B. The element l is the
greatest lower bound of B if it is a lower bound and for all lower bounds
l′ of B, l′ ≼ l.


In the definition we can write the word the because the order is antisymmetric.


EXAMPLE 4.3.13


The interval (3, 5) is a subset of R. Under the partial order ≤, both 5 and 10 are
upper bounds of this interval, while 5 is the least upper bound. Also, 3 and −π are
lower bounds, but 3 is the greatest lower bound.


EXAMPLE 4.3.14


Assume that 𝒞 ⊆ P(Z). Since the elements of 𝒞 are subsets of Z, we conclude
that ⋃ 𝒞 ∈ P(Z) (Definition 3.4.8). If A ∈ 𝒞, then A ⊆ ⋃ 𝒞. Therefore, ⋃ 𝒞
is an upper bound of 𝒞 with respect to ⊆. To see that it is the least upper bound,
let U be any upper bound of 𝒞. Take x ∈ ⋃ 𝒞. This means that there exists
D ∈ 𝒞 such that x ∈ D. Since U is an upper bound of 𝒞, D ⊆ U. Hence,
x ∈ U, and we conclude that ⋃ 𝒞 ⊆ U.
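
The claim that a union is a least upper bound under the subset order can be observed on a small finite stand-in for P(Z). In the Python sketch below, the ambient set {1, ..., 5}, the particular family, and the helper `powerset` are all assumptions chosen for illustration.

```python
from itertools import combinations


def powerset(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


# A small family of subsets of {1, ..., 5}.
family = [frozenset({1}), frozenset({2, 3}), frozenset({3, 4})]
ambient = powerset({1, 2, 3, 4, 5})

# Upper bounds of the family with respect to the subset order.
upper_bounds = [U for U in ambient if all(A <= U for A in family)]

union = frozenset().union(*family)
union_is_upper_bound = union in upper_bounds
union_is_least = all(union <= U for U in upper_bounds)
```

Here the upper bounds are {1, 2, 3, 4} and {1, 2, 3, 4, 5}, and the union {1, 2, 3, 4} is contained in each of them.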


Comparable and Compatible Elements 


In Figure 4.5, we see that 01 ≼ 010 and 01 ≼ 011. However, 010 ⋠ 011 and 011 ⋠
010. This means that in the poset of Example 4.3.7, there are pairs of elements that are
related to each other and there are other pairs that are not.


DEFINITION 4.3.15


Let (A, ≼) be a poset and a, b ∈ A. If a ≼ b or b ≼ a, then a and b are
comparable with respect to ≼. Elements of A are incomparable if they are not
comparable.


Continuing our review of the partially ordered set of Example 4.3.7, we note that
the element □ has the property that no element is less than it, but as seen in Figure 4.5,
for every element of A*, there exists an element of A* that is greater. However, that
same relation defined on


𝒜 = {σ ∈ A* : σ has at most 3 characters} (4.3)


has the property that 000, 001, 010, 011, 100, 101, 110, 111 have no elements greater
than them. This leads to the next definition.


DEFINITION 4.3.16
Let (A, ≼) be a poset and m ∈ A.
• m is a minimal element of A (with respect to ≼) if a ⊀ m for all a ∈ A.


• m is a maximal element of A (with respect to ≼) if m ⊀ a for all a ∈ A.




Therefore, the empty string is a minimal element of the set 𝒜 of (4.3), and 000, 001, 010, 011,
100, 101, 110, 111 are maximal. Notice that every least element is minimal and every
greatest element is maximal.
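
Minimal and maximal elements of a finite poset can be found by checking the defining condition directly. The following Python sketch does this for the set (4.3) with alphabet {0, 1}; the names `strictly_precedes` and `script_A` are chosen here for illustration, and `""` stands for the empty string.

```python
from itertools import product


def strictly_precedes(sigma, tau):
    """sigma is a proper prefix of tau."""
    return sigma != tau and tau.startswith(sigma)


# All strings over {0, 1} with at most 3 characters, as in (4.3).
script_A = ["".join(p) for n in range(4) for p in product("01", repeat=n)]

# m is minimal when nothing lies strictly below it,
# and maximal when nothing lies strictly above it.
minimal = [m for m in script_A if not any(strictly_precedes(a, m) for a in script_A)]
maximal = [m for m in script_A if not any(strictly_precedes(m, a) for a in script_A)]
```

The list `minimal` contains only the empty string, while `maximal` contains exactly the eight strings of length 3.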

Although not every pair of elements is comparable in the partially ordered set of
Example 4.3.7, there are infinite sequences of comparable elements, such as


□ ≼ 0 ≼ 01 ≼ 010 ≼ 0101 ≼ ···.


DEFINITION 4.3.17


A subset C of the poset (A, ≼) is a chain with respect to ≼ if a is comparable to
b for all a, b ∈ C.


When Z is partially ordered by ≤, the sets {0, 1, 2, 3, ...}, {..., −3, −2, −1, 0}, and
{..., −2, 0, 2, 4, ...} are chains. In P(Z), both


{{1, 2}, {1, 2, 3, 4}, {1, 2, 3, 4, 5, 6}}
and
{∅, {0}, {0, 1}, {0, 1, 2}, ...}
are chains with respect to ⊆.

EXAMPLE 4.3.18
To see that {A_k : k ∈ Z⁺} where A_k = {x ∈ Z : (x − 1)(x − 2) ··· (x − k) = 0}
is a chain with respect to ⊆, take m, n ∈ Z⁺. By definition,
A_m = {1, 2, ..., m}
and
A_n = {1, 2, ..., n}.


If m ≤ n, then A_m ⊆ A_n, otherwise A_n ⊆ A_m.


EXAMPLE 4.3.19


Let C₀ and C₁ be chains of A with respect to ≼. Take a, b ∈ C₀ ∩ C₁. Then,
a, b ∈ C₀, so a ≼ b or b ≼ a. Therefore, C₀ ∩ C₁ is a chain. However, the union
of two chains might not be a chain. For example, {{1}, {1, 2}} and {{1}, {1, 3}}
are chains in (P(Z), ⊆), but {{1}, {1, 2}, {1, 3}} is not a chain because {1, 2} ⊈
{1, 3} and {1, 3} ⊈ {1, 2}.
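
The behavior of chains under intersection and union can be confirmed for the two families in this example. The Python sketch below is illustrative only; `is_chain` is a name chosen here, and sets are modeled as `frozenset` objects so that they can belong to other sets.

```python
def is_chain(family):
    """Every two members of the family are comparable under the subset order."""
    return all(a <= b or b <= a for a in family for b in family)


C0 = {frozenset({1}), frozenset({1, 2})}
C1 = {frozenset({1}), frozenset({1, 3})}

both_chains = is_chain(C0) and is_chain(C1)
intersection_is_chain = is_chain(C0 & C1)
union_is_chain = is_chain(C0 | C1)
```

The first two flags are `True`, but `union_is_chain` is `False` because {1, 2} and {1, 3} are incomparable.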


The sets Z, Q, and R are chains of R with respect to ≤. In fact, any subset of R is a
chain of R because subsets of chains are chains. This motivates the next definition.


DEFINITION 4.3.20


The poset (A, ≼) is a linearly ordered set and ≼ is a linear order if A is a chain
with respect to ≼.




Since every subset A of R is a chain with respect to ≤, the relation ≤ is a linear order
on A. Furthermore, since every pair of elements in a linear order is comparable,
Theorem 4.3.8 can be strengthened.


THEOREM 4.3.21 [Trichotomy Law]


If ≼ is a linear order on A, for all a, b ∈ A, exactly one of the following is true:
a ≺ b, b ≺ a, or a = b.


Although it is not the case that every pair of elements of the poset defined in Example 4.3.7
is comparable, it is the case that for any given pair of elements, there exists
another element that is related to the given elements. For example, for the pair 100 and
110, 1 ≼ 100 and 1 ≼ 110. Also, for the pair 101 and 10, 10 ≼ 101 and 10 ≼ 10.


DEFINITION 4.3.22


Let (A, ≼) be a poset. The elements a, b ∈ A are compatible if there exists c ∈ A
such that c ≼ a and c ≼ b. If a and b are not compatible, they are incompatible
and we write a ⊥ b.


Observe that if a and b are comparable, they are also compatible. On the other hand, it
takes some work to define a relation in which every pair of elements is incompatible.


DEFINITION 4.3.23


A subset D of a poset (A, ≼) is an antichain with respect to ≼ when for all
a, b ∈ D, if a ≠ b, then a ⊥ b.


The sets
{{n} : n ∈ Z}


and
{{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}, ...}


are antichains of P(Z) \ {∅} with respect to ⊆. (The empty set is excluded because
∅ ⊆ B for every set B, which would make every pair of sets compatible.)


Well-Ordered Sets 


The notion of a linear order incorporates many of the properties of ≤ on N since ≤ is
reflexive, antisymmetric, and transitive, and N is a chain with respect to ≤. However,
there is one important property of (N, ≤) that is not included among those of a linear
order. Because of the nature of the natural numbers, every subset of N that is nonempty
has a least element. For example, 5 is the least element of {5, 7, 32, 99} and 2 is the
least element of {2, 4, 6, 8, ...}. We want to be able to identify those partial orders that
also have this property.


DEFINITION 4.3.24


The linearly ordered set (A, ≼) is a well-ordered set and ≼ is a well-order if
every nonempty subset of A has a least element with respect to ≼.




According to Definition 4.3.24, (N, ≤) is a well-ordered set, but this fact about the
natural numbers cannot be proved without making an assumption. Therefore, so that
we have at least one well-ordered set with which to work, we assume the following.


AXIOM 4.3.25


(N, ≤) is a well-ordered set.


Coupling Axiom 4.3.25 with the next theorem will yield infinitely many well-ordered
sets.


THEOREM 4.3.26


If (A, ≼) is well-ordered and B is a nonempty subset of A, then (B, ≼) is
well-ordered.


PROOF
Let B ⊆ A and B ≠ ∅. To prove that B is well-ordered, let C ⊆ B and C ≠ ∅.
Then, C ⊆ A. Since ≼ well-orders A, we know that C has a least element with
respect to ≼. ∎


Because Z ∩ [5, ∞) ⊆ Z⁺ ⊆ N, by Axiom 4.3.25 and Theorem 4.3.26, both (Z⁺, ≤)
and (Z ∩ [5, ∞), ≤) are well-ordered sets.


EXAMPLE 4.3.27


Let A = {nπ : n ∈ N}. To prove that A is well-ordered by ≤, let B ⊆ A such
that B ≠ ∅. This means that there exists a nonempty subset I of N such that
B = {nπ : n ∈ I}. Since N is well-ordered, I has a least element m. We claim
that mπ is the least element of B. To see this, take b ∈ B. Then, b = iπ for some
i ∈ I. Since m is the least element of I, m ≤ i. Therefore, mπ ≤ iπ = b.


To prove that a set A is not well-ordered by ≼, we must find a nonempty subset B
of A that does not have a least element. This means that for every b ∈ B, there exists
c ∈ B such that c ≺ b. That is, there are elements b_n ∈ B (n ∈ N) such that


··· ≺ b₂ ≺ b₁ ≺ b₀.
This informs the next definition.


DEFINITION 4.3.28


Let ≼ be a partial order on A and B = {a_i : i ∈ N} be a subset of A.
• B is increasing means i < j implies a_i ≺ a_j for all i, j ∈ N.
• B is decreasing means i < j implies a_j ≺ a_i for all i, j ∈ N.


If a nonempty set is well-ordered, it has a least element, but the converse is not true. To see
this, consider A = {0, 1/2, 1/3, 1/4, ...}. It has a least element, namely, 0, but A also
contains the decreasing set


{1/2, 1/3, 1/4, 1/5, ...}. (4.4)


Therefore, A has a subset without a least element, so A is not well-ordered. We summarize
this observation with the following theorem, and leave its proof to Exercise 23.


THEOREM 4.3.29


A linearly ordered set (A, ≼) is a well-ordered set if and only if (A, ≼) does not
have a decreasing subset.


Theorem 4.3.29 implies that any finite linear order is well-ordered.


EXAMPLE 4.3.30


The decreasing sequence (4.4) with Theorem 4.3.29 shows that the sets (0, 1),
[0, 1], Q, and {1/n : n ∈ Z⁺} are not well-ordered by ≤.


We close this section by proving two important results from number theory. Their 
proofs use Axiom 4.3.25. The strategy is to define a nonempty subset of a well-ordered 
set. Its least element r will be a number that we want. This least element also needs 
to have a particular property, say p(r). To show that it has the property, assume ¬p(r)
and use this to find another element of the set that is less than r. This contradicts the
minimality of r allowing us to conclude p(r). 


THEOREM 4.3.31 [Division Algorithm]


If m, n ∈ N with m ≠ 0, there exist unique q, r ∈ N such that r < m and
n = mq + r.


PROOF
Uniqueness is proved in Example 2.4.14. To prove existence, take m, n ∈ N with
m ≠ 0 and define
S = {k ∈ N : ∃l(l ∈ N ∧ n = ml + k)}.


Notice that n ∈ S, so S ≠ ∅. Therefore, S has a least element by Axiom 4.3.25.
Call it r and write n = mq + r for some natural number q. Assume r ≥ m, which
implies that r − m ≥ 0. Also,


n = m(q + 1) + r − m
because
r − m = n − mq − m = n − m(q + 1),
so r − m ∈ S. Since r > r − m because m is positive, r cannot be the minimum
of S, a contradiction. ∎


The value q of the division algorithm (Theorem 4.3.31) is called the quotient and r is
the remainder. For example, if we divide 5 into 17, the division algorithm returns a
quotient of 3 and a remainder of 2, so we can write that 17 = 5(3) + 2. Notice that
2 < 5.
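
The existence argument for the division algorithm suggests a simple procedure: start with the element n of S and keep subtracting m while the result is still a natural number. The Python sketch below follows that idea; the function name `divide` is chosen here, and the code is an illustration of the argument rather than an efficient method.

```python
def divide(n, m):
    """Return (q, r) with n == m*q + r and 0 <= r < m, for naturals n and m != 0."""
    if n < 0 or m <= 0:
        raise ValueError("expected n >= 0 and m > 0")
    q, r = 0, n  # n = m*0 + n, so n itself belongs to the set S of the proof
    while r >= m:
        # r - m is a smaller element of S, exactly as in the contradiction step
        r -= m
        q += 1
    return q, r


quotient, remainder = divide(17, 5)
```

For 17 and 5 this gives a quotient of 3 and a remainder of 2, in agreement with 17 = 5(3) + 2.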




Call n € Z a linear combination of the integers a and b if n = ua + vb for some 
u,v € Z. Since 37 = 5(2) + 3(9), we see that 37 is a linear combination of 2 and 9. 
Furthermore, if d | a and d | b, then d | ua + vb. To see this, write a = dl and b = dk
for some l, k ∈ Z. Then,


ua + vb = udl + vdk = d(ul + vk), 
and this means d | ua + vb. 


THEOREM 4.3.32


Let a, b ∈ Z with not both equal to 0. If c = gcd(a, b), there exist m, n ∈ Z such
that c = ma + nb.


PROOF
Define
T = {z ∈ Z⁺ : ∃x∃y(x, y ∈ Z ∧ z = xa + yb)}.
Notice that T is not empty because a² + b² ∈ T. By Axiom 4.3.25, T has a least
element d, so write d = ma + nb for some m, n ∈ Z.


• Since d > 0, the division algorithm (4.3.31) yields a = dq + r for some
natural numbers q and r with r < d. Then,


r = a − dq = a − (ma + nb)q = (1 − mq)a + (−nq)b.


If r > 0, then r ∈ T, which is impossible because d is the least element of
T. Therefore, r = 0 and d | a. Similarly, d | b.


• To show that d is the greatest of the common divisors, suppose s | a and
s | b with s ∈ Z⁺. By definition, a = sk and b = sl for some k, l ∈ Z.
Hence,
d = m(sk) + n(sl) = s(mk + nl).


Thus, s ≤ d because s is nonzero and s divides d (Exercise 25). Therefore,
d = gcd(a, b) = c. ∎
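
The proof above is nonconstructive: it obtains d as the least element of T without saying how to find m and n. One standard way to compute such coefficients is the extended Euclidean algorithm, which is a different technique from the well-ordering argument of the proof. The Python sketch below is offered only as an illustration, with names chosen here.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with g = gcd(a, b) >= 0 and g == m*a + n*b."""
    # Invariants: old_r == old_m*a + old_n*b and r == m*a + n*b.
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    if old_r < 0:  # normalize the sign so that the result is nonnegative
        old_r, old_m, old_n = -old_r, -old_m, -old_n
    return old_r, old_m, old_n


g, m, n = extended_gcd(12, 42)
```

For a = 12 and b = 42 the sketch returns g = 6 with m = −3 and n = 1, and indeed 6 = (−3)(12) + (1)(42).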


Exercises 
1. Is (∅, ⊆) a partial order, linear order, or well order? Explain.


2. For each relation on {1, 2}, determine if it is reflexive, irreflexive, symmetric,
asymmetric, antisymmetric, or transitive.

(a) {(1, 2)}

(b) {(1, 2), (2, 1)}

(c) {(1, 1), (1, 2), (2, 1)}

(d) {(1, 1), (1, 2), (2, 2)}

(e) {(1, 1), (1, 2), (2, 1), (2, 2)}

(f) ∅


3. Give an example of a relation that is neither symmetric nor asymmetric.




4. Let R be a relation on A. Prove that R is reflexive if and only if (A × A) \ R is
irreflexive.


5. Show that a relation R on A is asymmetric if and only if R ∩ R⁻¹ = ∅.
6. Let (A, ≼) and (B, ≤) be posets. Define ∼ on A × B by

(a, b) ∼ (a′, b′) if and only if a ≼ a′ and b ≤ b′.
Show that ∼ is a partial order on A × B.


7. For any alphabet A, prove the following.
(a) For all σ, τ, ν ∈ A*, σ⌢(τ⌢ν) = (σ⌢τ)⌢ν.
(b) There exist σ, τ ∈ A* such that σ⌢τ ≠ τ⌢σ.


(c) A* has an identity with respect to ⌢, but for all σ ∈ A*, there is no inverse
for σ if σ ≠ □.


8. Prove that A* from Example 4.3.7 is partially ordered by ≼.
9. Prove that (P(A), ⊆) is not a linear order if A has at least three elements.
10. Show that (A*, ≼) is not a linear order if A has at least two elements.


11. Prove that the following families of sets are chains with respect to ⊆.
(a) {[0, n] : n ∈ Z⁺}
(b) {(2ⁿ)Z : n ∈ N} where (2ⁿ)Z = {2ⁿ · k : k ∈ Z}
(c) {B_n : n ∈ N} where B_n = ⋃{A_i : i ∈ N ∧ i ≤ n} and A_i is a set for all
i ∈ N
12. Can a chain be disjoint or pairwise disjoint? Explain.


13. Suppose that {A; : i € N} is a chain of sets such that for alli < j, Aj € A;. Prove 
for allk EN. 

(a) Uta :iENAI SK} =A, 

(b) (\tA :iENAI SK} =AgQ 
14. Let {A, : n € N} be a family of sets. For every m € N, define B,, = IB iar A;. 
Show that {B,, : m € N} is a chain. 


15. Let 𝒞_i be a chain of the poset (A, ≼) for all i ∈ I. Prove that ⋂_{i ∈ I} 𝒞_i is a chain.
Is ⋃_{i ∈ I} 𝒞_i necessarily a chain? Explain.


16. Let (A, ≼) and (B, ≤) be linear orders. Define ∼ on A × B by
(a, b) ∼ (a′, b′)
if and only if
a ≺ a′, or a = a′ and b ≤ b′.


This relation is called a lexicographical order since it copies the order of a dictionary.
(a) Prove that (A × B, ∼) is a linear order.
(b) Suppose that (a, b) is a maximal element of A × B with respect to ∼. Show
that a is a maximal element of A with respect to ≼.




17. Let A be a set. Prove that ℬ ⊆ P(A) \ {∅} is an antichain with respect to ⊆ if and
only if ℬ is pairwise disjoint.
18. Prove the following true or false.

(a) Every well-ordered set contains a least element.

(b) Every well-ordered set contains a greatest element.

(c) Every subset of a well-ordered set contains a least element.

(d) Every subset of a well-ordered set contains a greatest element.

(e) Every well-ordered set has a decreasing subset.

(f) Every well-ordered set has an increasing subset.


19. For each of the given sets, indicate whether or not it is well-ordered by <. If it is, 
prove it. If it is not, find a decreasing sequence of elements of the set. 


(a) {1/2,5,6, 10.56, 17,-100} 
(b) {2n:nEN} 

(c) {a/n:nEZ*} 

(d) {x/n:nEZ } 

(Ot Bt 

(f) Za (a,c) 

(g) ZN (-7, 0) 


20. Prove that (Z ∩ (x, ∞), ≤) is a well-ordered set for all x ∈ R.
21. Show that a well-ordered set has a unique least element.


22. Let (A, ≼) be a well-ordered set. If B ⊆ A and there is an upper bound for B in A,
then B has a greatest element.


23. Prove Theorem 4.3.29.

24. Where does the proof of Theorem 4.3.31 go wrong if m = 0?

25. Prove that if a | b, then a ≤ b for all a, b ∈ Z⁺.

26. Let a, b, c ∈ Z. Show that if gcd(a, c) = gcd(b, c) = 1, then gcd(ab, c) = 1.


27. Let a, b ∈ Z and assume that gcd(a, b) = 1. Prove.
(a) If a | n and b | n, then ab | n.
(b) gcd(a + b, b) = gcd(a + b, a) = 1.
(c) gcd(a + b, a − b) = 1 or gcd(a + b, a − b) = 2.
(d) If c | a, then gcd(b, c) = 1.
(e) If c | a + b, then gcd(a, c) = gcd(b, c) = 1.
(f) If d | ac and d | bc, then d | c.


28. Let a, b ∈ Z, where at least one is nonzero. Prove that S = T if
S = {x : ∃l[l ∈ Z ∧ x = l gcd(a, b)]}


and
T = {x : x > 0 ∧ ∃u∃v(u, v ∈ Z ∧ x = ua + vb)}.




29. Prove that if d | a and d | b, then d | gcd(a, b) for all a, b, d ∈ Z.


4.4 FUNCTIONS 


From algebra and calculus, we know what a function is. It is a rule that assigns to each 
possible input value a unique output value. The common picture is that of a machine 
that when a certain button is pushed, the same result always happens. In basic algebra 
a relation can be graphed in the Cartesian plane. Such a relation will be a function if 
and only if every vertical line intersects its graph at most once. This is known as the
vertical line test (Figure 4.6). This criterion is generalized in the next definition.


DEFINITION 4.4.1


Let A and B be sets. A relation f ⊆ A × B is a function means that for all
(x, y), (x′, y′) ∈ f,


if x = x′, then y = y′.


The function f is an n-ary function if there exist sets A₀, A₁, ..., A_{n−1} such that
A = A₀ × A₁ × ··· × A_{n−1}. If n = 1, then f is a unary function, and if n = 2,
then f is a binary function.


EXAMPLE 4.4.2


The set {(1, 2), (4, 5), (6, 5)} is a function, but {(1, 2), (1, 5), (6, 5)} is not since it
contains (1, 2) and (1, 5). Also, ∅ is a function (Exercise 16).
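
For a finite relation given as a set of ordered pairs, the condition in the definition of a function can be tested directly: no first coordinate may be paired with two different second coordinates. The Python sketch below is a minimal illustration, and `is_function` is a name chosen here.

```python
def is_function(pairs):
    """True when no first coordinate is paired with two different second coordinates."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # (x, seen[x]) and (x, y) violate the definition
        seen[x] = y
    return True


first = is_function({(1, 2), (4, 5), (6, 5)})
second = is_function({(1, 2), (1, 5), (6, 5)})
empty = is_function(set())
```

The first set passes, the second fails because it contains both (1, 2) and (1, 5), and the empty relation passes vacuously.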


Intersects at most once 


Figure 4.6 Passing the vertical line test. 




EXAMPLE 4.4.3 


Define f = {(u, 1 + 2 cos πu) : u ∈ R}. Assume that both (x, 1 + 2 cos πx) and
(x′, 1 + 2 cos πx′) are elements of f. Then, because cosine is a function,


x = x′ ⇒ πx = πx′
⇒ cos πx = cos πx′
⇒ 2 cos πx = 2 cos πx′
⇒ 1 + 2 cos πx = 1 + 2 cos πx′.


Therefore, f is a function.


EXAMPLE 4.4.4


The standard arithmetic operations are functions. For example, taking a square
root is a unary function, while addition, subtraction, multiplication, and division
are binary functions. To illustrate this, addition on Z is the set


A = {((a, b), a + b) : a, b ∈ Z}. (4.5)


Let f ⊆ A × B be a function. This implies that for all a ∈ A, either [a]_f = ∅ or
[a]_f is a singleton (Definition 4.2.7). For example, if f = {(1, 2), (4, 5), (6, 5)}, then
[1]_f = {2} and [3]_f = ∅. Because [a]_f can contain at most one element, we typically
simplify the notation.


DEFINITION 4.4.5
Let f be a function. For all a ∈ A, define
f(a) = b if and only if [a]_f = {b}
and write that f(a) is undefined if [a]_f = ∅.


For example, using (4.5),
A((5, 7)) = 12,


but A(5, π) is undefined. With ordered n-tuples, the outer parentheses are usually
eliminated so that we write
A(5, 7) = 12.


Moreover, if D ⊆ A is the domain of the function f, write the function notation
f : D → B


and call B a codomain of f (Figure 4.7). Because functions are often represented by
arrows that “send” one element to another, a function can be called a map. If f(x) = y,
we can say that “f maps x to y.” We can also say that y is the image of x under f and
x is a pre-image of y. For example, if


f = {(3, 1), (4, 2), (5, 2)},




Function name Codomain 


Figure 4.7 Function notation. 


f maps 3 to 1, 2 is the image of 4, and 5 is a pre-image of 2 (Figure 4.8). If g is also a
function with domain D and codomain B, we can use the abbreviation


f, g : D → B


to represent both functions. An alternate choice of notation involves referring to the
functions as D → B.


EXAMPLE 4.4.6
If f(x) = cos x, then f maps π to −1, 0 is the image of π/2, and π/4 is a
pre-image of √2/2.

EXAMPLE 4.4.7


Let A be any set. The identity relation on A (Definition 4.1.2) is a function, so
call I_A the identity map and write


I_A(x) = x.
Sometimes a function f is defined using a rule that pairs an element of the function’s 
domain with an element of its codomain. When this is done successfully, f is said to 


be well-defined. Observe that proving that f is well-defined is the same as proving it 
to be a function. 


f 


Figure 4.8 The map f = {(3, 1), (4,2), (5, 2)}. 




EXAMPLE 4.4.8 
Let f : R → R be defined by f(x) = 1 + 2 cos πx. This is the function


f = {(x, 1 + 2 cos πx) : x ∈ R}.
The work of Example 4.4.3 shows that f is well-defined.


Before we examine another example, let us set a convention on naming functions. 
It is partly for aesthetics, but it does help in organizing functions based on the type of 
elements in their domains and ranges. 


¢ Use English letters (usually, f, g, and h) for naming functions that involve num- 
bers. Typically, these will be lowercase, but there are occasions when we will 
choose them to be uppercase. 


• Use Greek letters (often φ or ψ) for general functions or those with domains not
consisting of numbers. They are also usually lowercase, but uppercase Greek
letters like Φ and Ψ are sometimes appropriate. (See the appendix for the Greek
alphabet.)


EXAMPLE 4.4.9


Let n, m ∈ Z⁺ such that m | n. Define φ([a]_n) = [a]_m for all a ∈ Z. This means
that
φ = {([a]_n, [a]_m) : a ∈ Z}.


It is not clear that φ is well-defined since an equivalence class can have many
representatives, so assume that [a]_n = [b]_n for a, b ∈ Z. Therefore, n | a − b.
Then, by hypothesis, m | a − b, and this yields


φ([a]_n) = [a]_m = [b]_m = φ([b]_n).


EXAMPLE 4.4.10
Let x ∈ R. Define the greatest integer function as
⟦x⟧ = the greatest integer ≤ x.


For example, ⟦5⟧ = 5, ⟦1.4⟧ = 1, and ⟦−3.4⟧ = −4. The greatest integer
function is a function R → Z. It is well-defined because every nonempty set of
integers that is bounded above has a greatest element (compare Exercise 4.3.22).


If a relation on R is not a function, it will fail the vertical line test (Figure 4.9). To
generalize, let φ ⊆ A × B. To show that φ is not a function, we must show that there
exist (x, y₁), (x, y₂) ∈ φ such that y₁ ≠ y₂.


EXAMPLE 4.4.11


The relation f = I_R ∪ {(x, −x) : x ∈ R} is not a function since (4, 4) ∈ f and
(4, −4) ∈ f.




Figure 4.9 The relation is not a function. 


■ EXAMPLE 4.4.12


Define $\varphi \subseteq \mathbb{Z}_2 \times \mathbb{Z}_3$ by $\varphi([a]_2) = [a]_3$ for all $a \in \mathbb{Z}$. Since 3 does not divide 2,
$\varphi$ is not well-defined. This is proved by noting that $[0]_2 = [2]_2$ but $[0]_3 \ne [2]_3$.
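
The well-definedness checks of Examples 4.4.9 and 4.4.12 can be tried out on a computer. The following Python sketch is only an illustration: the helper names `induced_relation` and `is_function` are our own, each class $[a]_k$ is represented by its least nonnegative residue, and only a finite window of representatives is examined, so the output is evidence rather than a proof.

```python
def induced_relation(n, m, window=30):
    """Collect the pairs ([a]_n, [a]_m) for representatives a in a finite window.

    Each class [a]_k is represented by its least nonnegative residue a % k.
    """
    return {(a % n, a % m) for a in range(-window, window + 1)}


def is_function(relation):
    """Return True when no first coordinate is paired with two different second coordinates."""
    images = {}
    for x, y in relation:
        # setdefault stores y the first time x is seen and returns the stored value afterwards
        if images.setdefault(x, y) != y:
            return False
    return True


print(is_function(induced_relation(6, 3)))  # True: m = 3 divides n = 6
print(is_function(induced_relation(2, 3)))  # False: 3 does not divide 2
```

In the second call, the representatives 0 and 2 of the same class modulo 2 produce the pairs (0, 0) and (0, 2), which is the failure noted in Example 4.4.12.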


There will be times when we want to examine sets of functions. If each function is 
to have the same domain and codomain, we use the following notation. 


■ DEFINITION 4.4.13
If $A$ and $B$ are sets,


$${}^{A}B = \{\varphi : \varphi \text{ is a function } A \to B\}.$$


For instance, ${}^{A}B$ is a set of real-valued functions if $A, B \subseteq \mathbb{R}$.


■ EXAMPLE 4.4.14


If $a_n$ is a sequence of real numbers with $n = 0, 1, 2, \dots$, the sequence $a_n$ is an
element of ${}^{\mathbb{N}}\mathbb{R}$. Illustrating this, the sequence


$$a_n = (-1/2)^n$$


is a function and can be graphed as in Figure 4.10.


■ EXAMPLE 4.4.15
Let $A \subseteq \mathbb{R}$ and fix $a \in A$. The evaluation map,
$$\epsilon_a : {}^{A}A \to A,$$


is defined as


$$\epsilon_a(f) = f(a).$$


Figure 4.10 $a_n = (-1/2)^n$ is a function.


For example, if $g(x) = x^2$, then $\epsilon_3(g) = 9$. Observe that the evaluation map is an
element of ${}^{({}^{A}A)}A$.
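
An evaluation map takes a function as its input, which corresponds to a higher-order function in a programming language. The sketch below is an illustration only; the names `evaluation_map` and `square` are our own.

```python
def evaluation_map(a):
    """Return the map that sends a function f to the value f(a)."""
    def eps(f):
        return f(a)
    return eps


def square(x):
    return x ** 2


eps3 = evaluation_map(3)
print(eps3(square))  # 9, matching the value computed in Example 4.4.15
```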


Equality 


Since functions are sets, we already know that two functions are equal when they con- 
tain the same ordered pairs. However, there is a common test to determine function 
equality other than a direct appeal to Definition 3.3.7. 

■ THEOREM 4.4.16


Functions $\varphi, \psi : A \to B$ are equal if and only if $\varphi(x) = \psi(x)$ for all $x \in A$.


PROOF
Sufficiency is clear, so to prove necessity suppose that $\varphi(x) = \psi(x)$ for every
$x \in A$. Take $(a, b) \in \varphi$. This means that $\varphi(a) = b$. By hypothesis, $\psi(a) = b$,
which implies that $(a, b) \in \psi$. Hence, $\varphi \subseteq \psi$. The proof of $\psi \subseteq \varphi$ is similar, so
$\varphi = \psi$. ∎

We use Theorem 4.4.16 in the next examples. 


■ EXAMPLE 4.4.17
Let $f$ and $g$ be functions $\mathbb{R} \to \mathbb{R}$ defined by


$$f(x) = (x - 3)^2 + 2$$


and
$$g(x) = x^2 - 6x + 11.$$


We show that $f = g$ by taking $x \in \mathbb{R}$ and calculating


$$f(x) = (x - 3)^2 + 2 = (x^2 - 6x + 9) + 2 = x^2 - 6x + 11 = g(x).$$




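■ EXAMPLE 4.4.18


Let $\varphi, \psi : \mathbb{Z} \to \mathbb{Z}_6$ be functions such that


$$\varphi(n) = [n]_6$$


and
$$\psi(n) = [n + 12]_6.$$
Take $n \in \mathbb{Z}$. We show that $[n + 12]_6 = [n]_6$ by proceeding as follows:


$$\begin{aligned}
x \in [n + 12]_6 &\Leftrightarrow \exists k[k \in \mathbb{Z} \wedge x = n + 12 + 6k] \\
&\Leftrightarrow \exists k[k \in \mathbb{Z} \wedge x = n + 6(2 + k)] \\
&\Leftrightarrow x \in [n]_6.
\end{aligned}$$
Therefore, $\varphi = \psi$.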
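
Theorem 4.4.16 says that comparing two functions as sets of ordered pairs amounts to comparing their values point by point. The Python sketch below illustrates this on a finite window of inputs for the functions of Examples 4.4.17 and 4.4.18. The helper `as_pairs` and the window are our own assumptions, and checking a finite sample does not replace the proofs in the examples.

```python
def as_pairs(rule, domain):
    """Represent a rule on a finite domain as its set of ordered pairs."""
    return {(x, rule(x)) for x in domain}


def f(x):
    return (x - 3) ** 2 + 2


def g(x):
    return x ** 2 - 6 * x + 11


def phi(n):
    return n % 6          # least nonnegative representative of [n]_6


def psi(n):
    return (n + 12) % 6   # least nonnegative representative of [n + 12]_6


window = range(-50, 51)

# Equal as sets of pairs exactly when equal pointwise on the window.
print(as_pairs(f, window) == as_pairs(g, window))      # True
print(all(f(x) == g(x) for x in window))               # True
print(as_pairs(phi, window) == as_pairs(psi, window))  # True
```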
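■ EXAMPLE 4.4.19
Define
$$\psi : \mathbb{R} \to {}^{({}^{\mathbb{R}}\mathbb{R})}\mathbb{R}$$


by $\psi(x) = \epsilon_x$ for all $x \in \mathbb{R}$ (Example 4.4.15). We show that $\psi$ is well-defined.
Take $a, b \in \mathbb{R}$ and assume that $a = b$. To show that $\psi(a) = \psi(b)$, we prove
$\epsilon_a = \epsilon_b$. Therefore, let $f \in {}^{\mathbb{R}}\mathbb{R}$. Since $f$ is a function and $a, b \in \operatorname{dom}(f)$, we
have that $f(a) = f(b)$. Thus,


$$\epsilon_a(f) = f(a) = f(b) = \epsilon_b(f).$$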


By Theorem 4.4.16, we see that two functions $f$ and $g$ are not equal when either
$\operatorname{dom}(f) \ne \operatorname{dom}(g)$ or $f(x) \ne g(x)$ for some $x$ in their common domain. For example,
$f(x) = x^2$ and $g(x) = 2x$ are not equal because $f(3) = 9$ and $g(3) = 6$. Although
these two functions differ at every $x$ other than 0 and 2, it only takes one inequality to
prove that the functions are not equal. For example, if we define


$$h(x) = \begin{cases} x^2 & \text{if } x \ne 0, \\ 7 & \text{if } x = 0, \end{cases}$$


then $f \ne h$ since $f(0) = 0$ and $h(0) = 7$.


Composition 
We now consider the composition of relations when those relations are functions. 


■ THEOREM 4.4.20


If $\varphi : A \to B$ and $\psi : C \to D$ are functions such that $\operatorname{ran}(\varphi) \subseteq C$, then $\psi \circ \varphi$ is
a function $A \to D$ and $(\psi \circ \varphi)(x) = \psi(\varphi(x))$.




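PROOF
Because the range of $\varphi$ is a subset of $C$, we know that $\varphi \subseteq A \times C$ and $\psi \subseteq C \times D$.
Let $(a, d_1), (a, d_2) \in \psi \circ \varphi$. This means by Definition 4.1.9 that there exist
$c_1, c_2 \in C$ such that $(a, c_1), (a, c_2) \in \varphi$ and $(c_1, d_1), (c_2, d_2) \in \psi$. Since $\varphi$ is a
function, $c_1 = c_2$, and then since $\psi$ is a function, $d_1 = d_2$. Therefore, $\psi \circ \varphi$ is a
function, which is clearly $A \to D$. Furthermore,


$$\begin{aligned}
\psi \circ \varphi &= \{(x, z) : \exists y[y \in C \wedge (x, y) \in \varphi \wedge (y, z) \in \psi]\} \\
&= \{(x, z) : \exists y[y \in C \wedge \varphi(x) = y \wedge \psi(y) = z]\} \\
&= \{(x, z) : x \in A \wedge \psi(\varphi(x)) = z\} \\
&= \{(x, \psi(\varphi(x))) : x \in A\}.
\end{aligned}$$
Hence, $(\psi \circ \varphi)(x) = \psi(\varphi(x))$. ∎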


The $\operatorname{ran}(\varphi) \subseteq C$ condition is important to Theorem 4.4.20. For example, take the
real-valued functions $f(x) = x$ and $g(x) = \sqrt{x}$. Since $f(-1) = -1$ but $g(-1) \notin \mathbb{R}$, we
conclude that $(g \circ f)(-1)$ is undefined.
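
For finite functions, the role of the range condition in Theorem 4.4.20 can be seen directly. In the Python sketch below, a function is stored as a dictionary of its ordered pairs; the helper `compose` and the sample data are our own assumptions made for illustration.

```python
def compose(psi, phi):
    """Return psi ∘ phi for finite functions stored as dicts.

    Raises ValueError when ran(phi) is not a subset of dom(psi).
    """
    outside = set(phi.values()) - set(psi)
    if outside:
        raise ValueError(f"ran(phi) is not contained in dom(psi): {sorted(outside, key=repr)}")
    # (psi ∘ phi)(x) = psi(phi(x)) for every x in dom(phi)
    return {x: psi[phi[x]] for x in phi}


phi = {1: "a", 2: "b", 3: "a"}        # hypothetical function A -> B
psi = {"a": 10, "b": 20, "c": 30}     # hypothetical function C -> D with ran(phi) ⊆ C

print(compose(psi, phi))  # {1: 10, 2: 20, 3: 10}

try:
    compose(phi, psi)     # ran(psi) = {10, 20, 30} is not a subset of dom(phi)
except ValueError as err:
    print(err)
```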


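■ EXAMPLE 4.4.21


Define the two functions $f : \mathbb{R} \to \mathbb{Z}$ and $g : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ by $f(x) = [\![x]\!]$ and
$g(x) = 1/x$. Since $\operatorname{ran}(f) = \mathbb{Z} \nsubseteq \operatorname{dom}(g)$, there are elements of $\mathbb{R}$ for which
$g \circ f$ is undefined (for instance, every $x \in [0, 1)$, since then $f(x) = 0$). However,


$$\operatorname{ran}(g) = \mathbb{R} \setminus \{0\} \subseteq \mathbb{R} = \operatorname{dom}(f),$$
so $f \circ g$ is defined and for all $x \in \mathbb{R} \setminus \{0\}$,


$$(f \circ g)(x) = f(g(x)) = f(1/x) = [\![1/x]\!].$$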


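■ EXAMPLE 4.4.22


Let $\psi : {}^{\mathbb{Z}}\mathbb{Z} \to \mathbb{Z}$ be defined by $\psi(f) = \epsilon_3(f)$ and also let $\varphi : \mathbb{Z} \to \mathbb{Z}_7$ be
$\varphi(n) = [n]_7$. Since $\operatorname{ran}(\psi) \subseteq \operatorname{dom}(\varphi)$, $\varphi \circ \psi$ is defined. Thus, if $g : \mathbb{Z} \to \mathbb{Z}$ is
defined as $g(n) = 3n$,


$$(\varphi \circ \psi)(g) = \varphi(\psi(g)) = \varphi(g(3)) = \varphi(9) = [9]_7 = [2]_7.$$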


We should note that function composition is not a binary operation unless both functions
are $A \to A$ for some set $A$. In this case, function composition is a binary operation
on ${}^{A}A$.

Restrictions and Extensions 


There are times when a subset of a given function is required. For example, consider 


$$f = \{(x, x^2) : x \in \mathbb{R}\}.$$


If only positive values of $x$ are required, we can define
$$g = \{(x, x^2) : x \in (0, \infty)\}$$
so that $g \subseteq f$. We have notation for this.


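■ DEFINITION 4.4.23
Let $\varphi : A \to B$ be a function and $C \subseteq A$.


• The restriction of $\varphi$ to $C$ is the function $\varphi \upharpoonright C : C \to B$ so that


$$(\varphi \upharpoonright C)(x) = \varphi(x) \text{ for all } x \in C.$$


• The function $\psi : D \to E$ is an extension of $\varphi$ if $A \subseteq D$, $B \subseteq E$, and
$\psi \upharpoonright A = \varphi$.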


■ EXAMPLE 4.4.24


Let $f = \{(1, 2), (2, 3), (3, 4), (4, 1)\}$ and $g = \{(1, 2), (2, 3)\}$. We conclude that
$g = f \upharpoonright \{1, 2\}$, and $f$ is an extension of $g$.


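■ EXAMPLE 4.4.25
Let $\varphi : U \to V$ be a function and $A, B \subseteq U$. We conclude that


$$\varphi \upharpoonright (A \cup B) = (\varphi \upharpoonright A) \cup (\varphi \upharpoonright B)$$
because


$$\begin{aligned}
(x, y) \in \varphi \upharpoonright (A \cup B) &\Leftrightarrow y = \varphi(x) \wedge x \in A \cup B \\
&\Leftrightarrow y = \varphi(x) \wedge (x \in A \vee x \in B) \\
&\Leftrightarrow (y = \varphi(x) \wedge x \in A) \vee (y = \varphi(x) \wedge x \in B) \\
&\Leftrightarrow (x, y) \in \varphi \upharpoonright A \vee (x, y) \in \varphi \upharpoonright B \\
&\Leftrightarrow (x, y) \in (\varphi \upharpoonright A) \cup (\varphi \upharpoonright B).
\end{aligned}$$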
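
Restrictions of finite functions are easy to compute, which gives a quick way to experiment with Examples 4.4.24 and 4.4.25. In this Python sketch a function is a dictionary of its ordered pairs; the helper `restrict` and the sets `A` and `B` are our own choices for illustration.

```python
def restrict(phi, C):
    """Return the restriction of phi to C: the pairs of phi whose first coordinate lies in C."""
    return {x: y for x, y in phi.items() if x in C}


f = {1: 2, 2: 3, 3: 4, 4: 1}           # the function f of Example 4.4.24
print(restrict(f, {1, 2}))             # {1: 2, 2: 3}, the function g of that example

A, B = {1, 2}, {2, 4}                  # hypothetical subsets of dom(f)
left = restrict(f, A | B)
right = {**restrict(f, A), **restrict(f, B)}   # union of the two restrictions
print(left == right)                   # True, as Example 4.4.25 predicts
```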


Binary Operations 


Standard addition and multiplication of real numbers are functions $\mathbb{R} \times \mathbb{R} \to \mathbb{R}$
(Example 4.4.4). This means two things. First, given any two real numbers, their sum or
product will always be the same number. For instance, $3 + 5$ is 8 and never another number.
Second, given any two real numbers, their sum or product is also a real number.
Notice that subtraction also has these two properties when it is considered an operation
involving real numbers, but when we restrict subtraction to $\mathbb{Z}^+$, it no longer has the
second property because the difference of two positive integers might not be a positive
integer. For example, $3 - 5 = -2 \notin \mathbb{Z}^+$. That is, subtraction is not a function
$\mathbb{Z}^+ \times \mathbb{Z}^+ \to \mathbb{Z}^+$.




■ DEFINITION 4.4.26


A binary operation $*$ on the nonempty set $A$ is a function $A \times A \to A$.


The symbol that represents the addition function is $+$. It can be viewed as a function
$\mathbb{R} \times \mathbb{R} \to \mathbb{R}$. Therefore, using function notation, $+(3, 5) = 8$. However, we usually write
this as $3 + 5 = 8$. Similarly, since $*$ represents an operation like addition, instead of
writing $*(a, b)$, we usually write $a * b$.

To prove that a relation $*$ is a binary operation on $A$, we must show that it satisfies
Definition 4.4.26. To do this, take $a, a', b, b' \in A$, and prove:


• $a = a'$ and $b = b'$ implies $a * b = a' * b'$,


• $A$ is closed under $*$, that is, $a * b \in A$.


■ EXAMPLE 4.4.27
Define $x * y = 2x - y$ and take $a, a', b, b' \in \mathbb{Z}$.
• Assume $a = a'$ and $b = b'$. Then,
$$a * b = 2a - b = 2a' - b' = a' * b'.$$


The second equality holds because multiplication and subtraction are binary
operations on $\mathbb{Z}$.


• Because the product and the difference of two integers are integers, we have
that $a * b \in \mathbb{Z}$, so $\mathbb{Z}$ is closed under $*$.


Thus, $*$ is a binary operation on $\mathbb{Z}$.


■ EXAMPLE 4.4.28
Let $S = \{e, a, b, c\}$ and define $*$ by the following table:


[Operation table for $*$ on $S$; the entries are illegible in this copy.]


The table is read from left to right, so $b * c = a$. The table makes $*$ into a binary
operation since every pair of elements of $S$ is assigned a unique element of $S$.


■ EXAMPLE 4.4.29
Fix a set $A$. For any $X, Y \in \mathrm{P}(A)$, define $X * Y = X \cup Y$.


• Let $X_1, X_2, Y_1, Y_2 \in \mathrm{P}(A)$. If we assume that $X_1 = X_2$ and $Y_1 = Y_2$, we
have $X_1 \cup Y_1 = X_2 \cup Y_2$. Hence, $*$ is well-defined.


• To show that $\mathrm{P}(A)$ is closed under $*$, let $B$ and $C$ be subsets of $A$. Then
$B * C = B \cup C \in \mathrm{P}(A)$ because $B \cup C \subseteq A$ [Exercise 3.3.10(a)].


This shows that $*$ is a binary operation on $\mathrm{P}(A)$. Notice that $*$ is a subset of
$[\mathrm{P}(A) \times \mathrm{P}(A)] \times \mathrm{P}(A)$.


■ EXAMPLE 4.4.30


Let $m \in \mathbb{Z}^+$ and define $[a]_m + [b]_m = [a + b]_m$. We show that this is a binary
operation on $\mathbb{Z}_m$.


• Let $a_1, a_2, b_1, b_2 \in \mathbb{Z}$. Suppose $[a_1]_m = [a_2]_m$ and $[b_1]_m = [b_2]_m$. This
means that $a_1 = a_2 + mk$ and $b_1 = b_2 + ml$ for some $k, l \in \mathbb{Z}$. Hence,
$a_1 + b_1 = a_2 + b_2 + m(k + l)$, and we have $[a_1 + b_1]_m = [a_2 + b_2]_m$.


• For closure, let $[a]_m, [b]_m \in \mathbb{Z}_m$ where $a$ and $b$ are integers. Then, we have
that $[a]_m + [b]_m = [a + b]_m \in \mathbb{Z}_m$ since $a + b$ is an integer.
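
The first bullet of Example 4.4.30 says that the class of the sum does not depend on the representatives chosen. The Python sketch below tests this on a finite window of representatives. The helper `is_well_defined`, the window size, and the second rule (which uses an absolute value and is purely hypothetical) are our own assumptions, so this is an experiment and not a proof.

```python
def is_well_defined(rule, m, window=8):
    """Check, on a finite window of representatives, that rule(a, b) mod m
    depends only on the classes [a]_m and [b]_m."""
    reps = range(-window, window + 1)
    for a1 in reps:
        for a2 in reps:
            if (a1 - a2) % m:
                continue  # a1 and a2 represent different classes
            for b1 in reps:
                for b2 in reps:
                    if (b1 - b2) % m:
                        continue  # b1 and b2 represent different classes
                    if rule(a1, b1) % m != rule(a2, b2) % m:
                        return False
    return True


print(is_well_defined(lambda a, b: a + b, 5))       # True: the rule of Example 4.4.30
print(is_well_defined(lambda a, b: abs(a) + b, 5))  # False: hypothetical rule, e.g. [1]_5 = [-4]_5
```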


Many binary operations share similar properties with the operations of $+$ and $\times$ on $\mathbb{R}$.
The next definition gives four of these properties.


■ DEFINITION 4.4.31


Let $*$ be a binary operation on $A$.
• $*$ is associative means that $(a * b) * c = a * (b * c)$ for all $a, b, c \in A$.
• $*$ is commutative means that $a * b = b * a$ for all $a, b \in A$.


• The element $e$ is an identity of $A$ with respect to $*$ when $e \in A$ and
$e * a = a * e = a$ for all $a \in A$.


• Suppose that $A$ has an identity $e$ with respect to $*$ and let $a \in A$. The
element $a' \in A$ is an inverse of $a$ with respect to $*$ if $a * a' = a' * a = e$.


Notice that the identity, if it exists, must be unique. To prove this, suppose that
both $e$ and $e'$ are identities. These must be equal because $e = e * e' = e'$, where the
first equality holds since $e'$ is an identity and the second since $e$ is an identity. So, if
a set has an identity with respect to an operation, we can refer to it as the identity of
the set. Similarly, we can write the inverse if it exists for associative binary operations
(Exercise 20).
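
On a finite set, the four properties of Definition 4.4.31 can be checked by brute force. The Python sketch below does this for addition on $\mathbb{Z}_5$ (Example 4.4.30 with $m = 5$, classes represented by 0 through 4). The helper `properties` is our own, and the second operation, the rule $2x - y$ reduced modulo 5, is a hypothetical example chosen only for contrast.

```python
from itertools import product


def properties(elements, op):
    """Return (associative, commutative, identities, inverses) for op on a finite set."""
    elements = list(elements)
    associative = all(op(op(a, b), c) == op(a, op(b, c))
                      for a, b, c in product(elements, repeat=3))
    commutative = all(op(a, b) == op(b, a)
                      for a, b in product(elements, repeat=2))
    identities = [e for e in elements
                  if all(op(e, a) == a == op(a, e) for a in elements)]
    inverses = {}
    if identities:
        e = identities[0]
        for a in elements:
            inverses[a] = [b for b in elements if op(a, b) == e == op(b, a)]
    return associative, commutative, identities, inverses


Z5 = range(5)
print(properties(Z5, lambda a, b: (a + b) % 5))
# (True, True, [0], {0: [0], 1: [4], 2: [3], 3: [2], 4: [1]})

print(properties(Z5, lambda a, b: (2 * a - b) % 5))
# (False, False, [], {})
```

The list of identities never has more than one entry, in line with the uniqueness argument above.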


■ EXAMPLE 4.4.32


We assume that $+$ and $\times$ are both associative and commutative on $\mathbb{C}$ and all subsets
of $\mathbb{C}$, that 0 is the identity with respect to $+$ (the additive identity) and 1 is the
identity with respect to $\times$ (the multiplicative identity), and that every complex
number has an inverse with respect to $+$ (an additive inverse) and every nonzero
complex number has an inverse with respect to $\times$ (a multiplicative inverse).




■ EXAMPLE 4.4.33


The binary operation defined in Example 4.4.30 is both associative and commutative.
To see that it is commutative, let $a, b \in \mathbb{Z}$. Then,


$$[a]_m + [b]_m = [a + b]_m = [b + a]_m = [b]_m + [a]_m, \tag{4.6}$$


where the second equality holds because $+$ is commutative on $\mathbb{Z}$. Its identity is
$[0]_m$ [let $b = 0$ in (4.6)] and the additive inverse of $[a]_m$ is $[-a]_m$.

■ EXAMPLE 4.4.34


Since $A \cup B = B \cup A$ and $(A \cup B) \cup C = A \cup (B \cup C)$ for all sets $A$, $B$, and $C$,
the binary operation in Example 4.4.29 is both associative and commutative. Its
identity is $\varnothing$, and only $\varnothing$ has an inverse.


Exercises 


1. Indicate whether each of the given relations is a function. If a relation is not a
function, find an element of its domain that is paired with two elements of its range.
(a) $\{(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)\}$
(b) $\{(1, 1), (1, 2), (1, 3), (1, 4), (1, 5)\}$
(c) $\{(x, \sqrt{|x|}) : x \in \mathbb{R}\}$
(d) $\{(x, \pm\sqrt{|x|}) : x \in \mathbb{R}\}$
(e) $\{(x, x^2) : x \in \mathbb{R}\}$
(f) $\{([a]_5, b) : \exists k \in \mathbb{Z}(a = b + 5k)\}$
(g) $\psi : \mathbb{Z} \to \mathbb{Z}_5$ if $\psi(a) = [a]_5$
2. Prove that the given relations are functions.
(a) $\{(x, 1/x) : x \in \mathbb{R} \setminus \{0\}\}$
(b) $\{(x, x + 1) : x \in \mathbb{Z}\}$
(c) $\{(x, |x|) : x \in \mathbb{R}\}$
(d) $\{(x, \sqrt{x}) : x \in [0, \infty)\}$


3. Let $f = \{(x, y) \in \mathbb{R}^2 : 2x + y = 1\}$. Show that $f$ is a function with domain and
codomain equal to the set of real numbers.


4. Let $f, g : \mathbb{R} \to \mathbb{R}$ be functions. Prove that $\varphi(x, y) = (f(x), g(y))$ is a function with
domain and codomain equal to $\mathbb{R} \times \mathbb{R}$.


5. Let $A$ be a set and define $\psi(A) = \mathrm{P}(A)$. Show that $\psi$ is a function.


6. Define 
x2 ifx>0, 
fon ¥3 ifx <0. 


Show that f is well-defined with domain equal to R. What is ran(f)? 




7. Let x be in the domain and y in the range of each relation. Explain why each of the 
given equations does not describe a function. 

(a) y=54x 

(b) x*7+y=1 

(c) x=4y-1 

@ y-x=9 


8. Let $\varphi : \mathbb{Z} \to \mathbb{Z}_7$ be defined by $\varphi(a) = [a]_7$. Write the given images as rosters.
(a) $\varphi(0)$
(b) $\varphi(7)$
(c) $\varphi(3)$
(d) $\varphi(-3)$


9. Define $\varphi([a]_n) = [a]_m$ for all $a \in \mathbb{Z}$. Is $\varphi$ being a function sufficient for $m \mid n$?
Explain.


10. Give an example of a function that is an element of the given sets.
(a) ${}^{\mathbb{R}}\mathbb{R}$
(b) ${}^{\mathbb{R}}\mathbb{Z}$
(c) ${}^{\mathbb{N}}\mathbb{R}$
(d) ${}^{\mathbb{R}}[0, \infty)$
(e) 4(Zs) 
11. Evaluate the indicated expressions. 
(a) e4(f) if f(x) =9x +2 
(b) e,(g) if g(@) = sin@ 


12. Since functions are sets, we can perform set operations on them. Let $f(x) = x$
and $g(x) = -x$. Find the following.

(a) $f \cup g$

(b) $f \cap g$

(c) $f \setminus g$

(d) $g \setminus f$


13. Let $\mathscr{C}$ be a chain of functions with respect to $\subseteq$. Prove that $\bigcup \mathscr{C}$ is a function.


14. Let $f$ and $g$ be functions. Prove the following.

(a) $f \cap g$ is a function.

(b) $f \cup g$ is a function if and only if $f(x) = g(x)$ for all $x \in \operatorname{dom}(f) \cap \operatorname{dom}(g)$.
15. Let $f : A \to B$ be a function. Define a relation $S$ on $A$ by $a \mathrel{S} b$ if and only if
$f(a) = f(b)$.

(a) Show $S$ is an equivalence relation.

(b) Find $[3]_S$ if $f : \mathbb{Z} \to \mathbb{Z}$ is defined by $f(n) = 2n$.

(c) Find $[2]_S$ if $f : \mathbb{Z} \to \mathbb{Z}_5$ is given by $f(n) = [n]_5$.


16. Show that $\varnothing$ is a function and find its domain and range.




17. Define $*$ by $x * y = x + y + 2$ for all $x, y \in \mathbb{Z}$.
(a) Show that $*$ is a binary operation on $\mathbb{Z}$.
(b) Prove that $-2$ is the identity of $\mathbb{Z}$ with respect to $*$.
(c) For every $n \in \mathbb{Z}$, show that $-n - 4$ is the inverse of $n$ with respect to $*$.


18. Define the binary operation $*$ by $x * y = 2x - y$ for all $x, y \in \mathbb{Z}$.
(a) Is there an integer that serves as an identity with respect to $*$?
(b) Does every integer have an inverse with respect to $*$?


19. Let $f, g : \mathbb{R} \to \mathbb{R}$ be functions. Prove that $f \circ g$ is well-defined.


20. For an associative binary operation, prove that the inverse of an element is unique 
if it exists. Show that this might not be the case if the binary operation is not associative. 


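21. Prove that the given pairs of functions are equal.

(a) $f(x) = (x - 1)(x - 2)(x + 3)$ and $g(x) = x^3 - 7x + 6$
where $f, g : \mathbb{R} \to \mathbb{R}$

(b) $\varphi(a, b) = a + b$ and $\psi(a, b) = b + a$
where $\varphi, \psi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$

(c) $\varphi(a, b) = ([a]_5, [b + 7]_5)$ and $\psi(a, b) = ([a + 5]_5, [b - 3]_5)$
where $\varphi, \psi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}_5 \times \mathbb{Z}_5$

(d) $\varphi(f) = f \upharpoonright \mathbb{Z}$ and $\psi(f) = \{(n, f(n)) : n \in \mathbb{Z}\}$
where $\varphi, \psi : {}^{\mathbb{R}}\mathbb{R} \to {}^{\mathbb{Z}}\mathbb{R}$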


22. Show that the given pairs of functions are not equal.
(a) $f(x) = x$ and $g(x) = 2x$ where $f, g : \mathbb{R} \to \mathbb{R}$
(b) $f(x) = x - 3$ and $g(x) = x + 3$ where $f, g : \mathbb{R} \to \mathbb{R}$
(c) g(a) = [a]; and y(a) = [a], where 9, yw: Z > Z, UZ; 
(d) $\varphi(A) = A \setminus \{0\}$ and $\psi(A) = A \cap \{1, 2, 3\}$ where $\varphi, \psi : \mathrm{P}(\mathbb{Z}) \to \mathrm{P}(\mathbb{Z})$


23. Let $\psi : \mathbb{R} \to {}^{\mathbb{R}}\mathbb{R}$ be defined by $\psi(a) = f_a$, where $f_a$ is the function $f_a : \mathbb{R} \to \mathbb{R}$
with $f_a(x) = ax$. Prove that $\psi$ is well-defined.


24. For each pair of functions, find the indicated values when possible. 

(a) f:R—- Rand f(x) = 2x? 
g:R- Rand g(x)=x+4+1 
(f 0 g)(2) 

(g 0 f)(0) 

(b) $f : [0, \infty) \to \mathbb{R}$ and $f(x) = \sqrt{x}$
$g : \mathbb{R} \to \mathbb{R}$ and $g(x) = |x| - 1$
$(f \circ g)(0)$

$(g \circ f)(4)$

(c) g:Z-— Z, and y(a) = [a]; 
w:®R-= Rand y(f) = f) 
(po yw)(.5x + 1) 

(yw 0 p)(2) 


Section 4.5 INJECTIONS AND SURJECTIONS 203 


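25. For each of the given functions, find the composition of the function with itself.
For example, find $f \circ f$ for part (a).

(a) $f : \mathbb{R} \to \mathbb{R}$ with $f(x) = x^2$

(b) $g : \mathbb{R} \to \mathbb{R}$ with $g(x) = 3x + 1$

(c) $\varphi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z} \times \mathbb{Z}$ with $\varphi(x, y) = (2y, 5x - y)$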

() Wt Zn Ly with (Ln) ,) = [n+ lp 


26. Let $\varphi : A \to B$ be a function and $\psi = \varphi \upharpoonright C$ where $C \subseteq A$. Prove that if $\iota : C \to A$
is defined by $\iota(c) = c$ (known as the inclusion map), then $\psi = \varphi \circ \iota$.


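27. Write the given restrictions as rosters.
(a) $\{(1, 2), (2, 2), (3, 4), (4, 7)\} \upharpoonright \{1, 3\}$
(b) $f \upharpoonright \{0, 1, 2, 3\}$ where $f(x) = 7x - 1$ and $\operatorname{dom}(f) = \mathbb{R}$
(c) $(g + h) \upharpoonright \{-3.3, 1.2, 7\}$ where $g(x) = [\![x]\!]$, $h(x) = x + 1$, and both $\operatorname{dom}(g)$
and $\operatorname{dom}(h)$ equal $\mathbb{R}$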


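28. For functions $f$ and $g$ such that $A, B \subseteq \operatorname{dom}(f)$, prove the following.

(a) $f \upharpoonright A = f \cap [A \times \operatorname{ran}(f)]$

(b) $f \upharpoonright (A \cap B) = (f \upharpoonright A) \cap (f \upharpoonright B)$

(c) $f \upharpoonright (A \setminus B) = (f \upharpoonright A) \setminus (f \upharpoonright B)$

(d) $(g \circ f) \upharpoonright A = g \circ (f \upharpoonright A)$
29. Let $f : U \to V$ be a function. Prove that if $A \subseteq U$, then $f \upharpoonright A = f \circ I_A$.
30. Let $\varphi : {}^{A}C \to {}^{B}C$ be defined by $\varphi(f) = f \upharpoonright B$. Prove that $\varphi$ is well-defined.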


31. A real-valued function $f$ is periodic if there exists $k > 0$ so that $f(x) = f(x + k)$
for all $x \in \operatorname{dom}(f)$. Let $g, h : \mathbb{R} \to \mathbb{R}$ be functions with period $k$. Prove that $g \circ h$ is
periodic with period $k$.


32. Let $(A, \preccurlyeq)$ be a poset. A function $\varphi : A \to A$ is increasing means for all $x, y \in A$,
if $x \preccurlyeq y$, then $\varphi(x) \preccurlyeq \varphi(y)$. A decreasing function is defined similarly. Suppose that
$\sigma$ and $\tau$ are increasing. Prove that $\sigma \circ \tau$ is increasing.


4.5 INJECTIONS AND SURJECTIONS 


When looking at relations, we studied the concept of an inverse relation. Given $R$, obtain
$R^{-1}$ by exchanging the $x$- and $y$-coordinates. The same can be done with functions,
but the inverse might not be a function. For example, given


$$f = \{(1, 2), (2, 3), (3, 2)\},$$


its inverse is
$$f^{-1} = \{(2, 1), (3, 2), (2, 3)\}.$$


However, if the original relation is a function, we often want the inverse also to be a 
function. This leads to the next definition. 




■ DEFINITION 4.5.1


$\varphi : A \to B$ is invertible means that $\varphi^{-1}$ is a function $B \to A$.


An immediate consequence of the definition is the next result.
■ LEMMA 4.5.2
Let $\varphi$ be invertible. Then, $\varphi(x) = y$ if and only if $\varphi^{-1}(y) = x$ for all $x \in \operatorname{dom}(\varphi)$.


PROOF
Suppose that $\varphi(x) = y$. This means that $(x, y) \in \varphi$, so $(y, x) \in \varphi^{-1}$. Since
$\varphi$ is invertible, $\varphi^{-1}$ is a function, so write $\varphi^{-1}(y) = x$. The converse is proved
similarly. ∎

We use Lemma 4.5.2 in the proof of the next theorem, which gives conditions for 
when a function is invertible. 


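■ THEOREM 4.5.3


$\varphi : A \to B$ is invertible if and only if $\varphi^{-1} \circ \varphi = I_A$ and $\varphi \circ \varphi^{-1} = I_B$.
PROOF


Take a function $\varphi : A \to B$. Then, $\varphi^{-1} \subseteq B \times A$.


• Assume that $\varphi^{-1}$ is a function $B \to A$. Let $x \in A$ and $y \in B$. By assumption,
we have $x_0 \in A$ and $y_0 \in B$ such that $\varphi(x) = y_0$ and $\varphi^{-1}(y) = x_0$.
This implies that $\varphi^{-1}(y_0) = x$ and $\varphi(x_0) = y$ by Lemma 4.5.2. Therefore,


$$(\varphi^{-1} \circ \varphi)(x) = \varphi^{-1}(\varphi(x)) = \varphi^{-1}(y_0) = x$$


and
$$(\varphi \circ \varphi^{-1})(y) = \varphi(\varphi^{-1}(y)) = \varphi(x_0) = y.$$


• Now assume $\varphi^{-1} \circ \varphi = I_A$ and $\varphi \circ \varphi^{-1} = I_B$. To show that $\varphi^{-1}$ is a
function, take $(y, x), (y, x') \in \varphi^{-1}$. From this, we know that $(x, y) \in \varphi$.
Therefore,


$$(x, x') \in \varphi^{-1} \circ \varphi = I_A,$$


so $x = x'$. In addition, we know that $\operatorname{dom}(\varphi^{-1}) \subseteq B$, so to prove equality,
let $y \in B$. Then,
$$(y, y) \in I_B = \varphi \circ \varphi^{-1}.$$


Thus, there exists $x \in A$ such that $(y, x) \in \varphi^{-1}$, so $y \in \operatorname{dom}(\varphi^{-1})$. ∎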


■ EXAMPLE 4.5.4


• Let $f : \mathbb{R} \to \mathbb{R}$ be the function given by $f(x) = x + 2$. Its inverse is $g(x) = x - 2$
by Theorem 4.5.3. This is because


$$(g \circ f)(x) = g(x + 2) = (x + 2) - 2 = x$$


and
$$(f \circ g)(x) = f(x - 2) = (x - 2) + 2 = x.$$


• If the function $g : [0, \infty) \to [0, \infty)$ is defined by $g(x) = x^2$, then $g^{-1}$ is a
function $[0, \infty) \to [0, \infty)$ and is defined by $g^{-1}(x) = \sqrt{x}$.


• Let $h : \mathbb{R} \to (0, \infty)$ be defined as $h(x) = e^x$. By Theorem 4.5.3, we know that
$h^{-1}(x) = \ln x$ because $e^{\ln x} = x$ for all $x \in (0, \infty)$ and $\ln e^x = x$ for all $x \in \mathbb{R}$.
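
For finite functions the criterion of Theorem 4.5.3 can be checked mechanically by treating functions as sets of ordered pairs. The Python sketch below is an illustration under our own assumptions: the helper names and both sample relations are hypothetical and are not taken from the text.

```python
def inverse(rel):
    """Exchange the coordinates of every pair."""
    return {(y, x) for x, y in rel}


def compose(s, r):
    """Return s ∘ r = {(x, z) : (x, y) in r and (y, z) in s for some y}."""
    return {(x, z) for x, y in r for y2, z in s if y == y2}


def identity(A):
    return {(a, a) for a in A}


def is_invertible(phi, A, B):
    """Test the two equations of Theorem 4.5.3 for phi : A -> B."""
    inv = inverse(phi)
    return compose(inv, phi) == identity(A) and compose(phi, inv) == identity(B)


A, B = {1, 2, 3}, {"a", "b", "c"}
phi = {(1, "a"), (2, "b"), (3, "c")}      # hypothetical invertible function
chi = {(1, "a"), (2, "a"), (3, "b")}      # hypothetical function that is not invertible

print(is_invertible(phi, A, B))  # True
print(is_invertible(chi, A, B))  # False
```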


Injections 


Theorem 4.5.3 can be improved by finding a condition for the invertibility of a function
based only on the given function. Consider the following. In order for a relation to be a
function, it cannot look like Figure 4.9. Since the inverse exchanges the roles of the two
coordinates, in order for an inverse to be a function, the original function cannot look
like the graph in Figure 4.11. In other words, if $f^{-1}$ is to be a function, there cannot
exist $x_1$ and $x_2$ so that $x_1 \ne x_2$ and $f(x_1) = f(x_2)$. But then,


$$\neg\exists x_1 \exists x_2[x_1 \ne x_2 \wedge f(x_1) = f(x_2)]$$
is equivalent to

$$\forall x_1 \forall x_2[x_1 = x_2 \vee f(x_1) \ne f(x_2)],$$
which in turn is equivalent to

$$\forall x_1 \forall x_2[f(x_1) = f(x_2) \to x_1 = x_2].$$


Hence, $f$ being an invertible function implies that for every $x_1, x_2 \in \operatorname{dom}(f)$,


$$x_1 = x_2 \text{ if and only if } f(x_1) = f(x_2). \tag{4.7}$$


Figure 4.11 The inverse of a function might not be a function. 


Figure 4.12 $\varphi$ is a one-to-one function.


This means that the elements of the domain of f and the elements of the range of f form 
pairs of elements as illustrated in Figure 4.12. The sufficiency of (4.7) is the definition 
of a function, while necessity is the next definition. 


■ DEFINITION 4.5.5
The function $\varphi : A \to B$ is one-to-one if and only if for all $x_1, x_2 \in A$,
if $\varphi(x_1) = \varphi(x_2)$, then $x_1 = x_2$.


A one-to-one function is sometimes called an injection.


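■ EXAMPLE 4.5.6


Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = 5x + 1$. To show that $f$ is one-to-one, let $x_1, x_2 \in \mathbb{R}$
and assume $f(x_1) = f(x_2)$. Then,


$$\begin{aligned}
5x_1 + 1 &= 5x_2 + 1, \\
5x_1 &= 5x_2, \\
x_1 &= x_2.
\end{aligned}$$
■ EXAMPLE 4.5.7
Let $\varphi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z} \times \mathbb{Z} \times \mathbb{Z}$ be the function
$$\varphi(a, b) = (a, b, 0).$$
For any $(a_1, b_1), (a_2, b_2) \in \mathbb{Z} \times \mathbb{Z}$, assume
$$\varphi(a_1, b_1) = \varphi(a_2, b_2).$$


This means that
$$(a_1, b_1, 0) = (a_2, b_2, 0).$$


Hence, $a_1 = a_2$ and $b_1 = b_2$, and this yields $(a_1, b_1) = (a_2, b_2)$.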


Figure 4.13 $\varphi$ is not a one-to-one function.


If a function is not one-to-one, there must be an element of the range that has at
least two pre-images (Figure 4.13). An example of a function that is not one-to-one is
$f(x) = x^2$ where both the domain and codomain of $f$ are $\mathbb{R}$. This is because $f(2) = 4$
and $f(-2) = 4$. Another example is $g : \mathbb{R} \to \mathbb{R}$ defined by $g(\theta) = \cos\theta$. It is not an
injection because $g(0) = g(2\pi) = 1$.

Although the original function might not be one-to-one, we can always restrict the 
function to a subset of its domain so that the resulting function is one-to-one. This is 
illustrated in the next two examples. 


■ EXAMPLE 4.5.8


Let $f$ be the function $\{(1, 5), (2, 8), (3, 8), (4, 6)\}$. We observe that $f$ is not
one-to-one, but both


$$f \upharpoonright \{1, 2\} = \{(1, 5), (2, 8)\}$$


and


$$f \upharpoonright \{3, 4\} = \{(3, 8), (4, 6)\}$$


are one-to-one as in Figure 4.14.
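
For a finite function, being one-to-one amounts to having no repeated second coordinate. The Python sketch below checks the function of Example 4.5.8 and its two restrictions; the helper names `is_injective` and `restrict` are our own.

```python
def is_injective(phi):
    """A finite function (dict) is one-to-one when no value is repeated."""
    values = list(phi.values())
    return len(values) == len(set(values))


def restrict(phi, C):
    """Keep only the pairs whose first coordinate lies in C."""
    return {x: y for x, y in phi.items() if x in C}


f = {1: 5, 2: 8, 3: 8, 4: 6}

print(is_injective(f))                    # False: 2 and 3 are both sent to 8
print(is_injective(restrict(f, {1, 2})))  # True
print(is_injective(restrict(f, {3, 4})))  # True
```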


Figure 4.14 Restrictions of $f$ to $A$ and $B$ are one-to-one.




■ EXAMPLE 4.5.9


Let $g : \mathbb{R} \to \mathbb{R}$ be the function $g(x) = x^2$. This function is not one-to-one, but
$g \upharpoonright [0, \infty)$ and $g \upharpoonright (-10, -5)$ are one-to-one.


Let $f(x) = 3x + 6$ and $g(x) = \sqrt{5x - 8}$. Both are injections. Notice that
$$(f \circ g)(x) = 3\sqrt{5x - 8} + 6$$


and


$$(g \circ f)(x) = \sqrt{15x + 22}$$


are also injections. We can generalize this result to the next theorem.


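■ THEOREM 4.5.10


If $\varphi : A \to B$ and $\psi : B \to C$ are injections, $\psi \circ \varphi$ is an injection.


PROOF
Assume that $\varphi : A \to B$ and $\psi : B \to C$ are one-to-one. Let $a_1, a_2 \in A$ and
assume $(\psi \circ \varphi)(a_1) = (\psi \circ \varphi)(a_2)$. Then,
$$\psi(\varphi(a_1)) = \psi(\varphi(a_2)).$$


Since $\psi$ is one-to-one,
$$\varphi(a_1) = \varphi(a_2),$$


and since $\varphi$ is one-to-one, $a_1 = a_2$. ∎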


Surjections 


The function being an injection is not sufficient for it to be invertible since it is pos- 
sible that not every element of the codomain will have a pre-image. In this case, the 
codomain cannot be the domain of the inverse. To prevent this situation, we will need 
the function to satisfy the next definition. 


■ DEFINITION 4.5.11


A function $\varphi : A \to B$ is onto if and only if for every $y \in B$, there exists $x \in A$
such that $\varphi(x) = y$. An onto function is also called a surjection.


This definition is related to the range (or image) of the function. The range of the
function $\varphi : A \to B$ is


$$\operatorname{ran}(\varphi) = \{y : \exists x(x \in A \wedge (x, y) \in \varphi)\} = \{\varphi(x) : x \in \operatorname{dom}(\varphi)\}$$
as illustrated in Figure 4.15. Thus, $\varphi$ is onto if and only if $\operatorname{ran}(\varphi) = B$.
■ EXAMPLE 4.5.12


Define $f : [0, \infty) \to [0, \infty)$ by $f(x) = \sqrt{x}$. Its range is also $[0, \infty)$, so it is onto.


Figure 4.15 The range of $\varphi : A \to B$ with $y = \varphi(x)$.


■ EXAMPLE 4.5.13


The ranges of the following functions are different from their codomains, so they
are not onto.


• Let $g : \mathbb{R} \to \mathbb{R}$ be defined by $g(x) = |x|$. Then, $\operatorname{ran}(g) = [0, \infty)$.
• Define $h : \mathbb{Z} \to \mathbb{Z}$ by $h(n) = 2n$. Here, $\operatorname{ran}(h) = \{2n : n \in \mathbb{Z}\}$.


The functions illustrated in Figures 4.12, 4.13, and 4.14 are onto functions as are 
those in the next examples. 


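■ EXAMPLE 4.5.14


Any linear function $f : \mathbb{R} \to \mathbb{R}$ that is not a horizontal line is a surjection. To
see this, let $f(x) = ax + b$ for some $a \ne 0$. Take $y \in \mathbb{R}$. We need to find $x \in \mathbb{R}$
so that $ax + b = y$. Choose
$$x = \frac{y - b}{a}.$$

Then,
$$f\left(\frac{y - b}{a}\right) = a\left(\frac{y - b}{a}\right) + b = y.$$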


The approach in the example is typical. To show that a function is onto, take an 
arbitrary element of the codomain and search for a candidate to serve as its pre-image. 
When found, check it. 


■ EXAMPLE 4.5.15


Take a positive integer $m$ and let $\varphi : \mathbb{Z} \to \mathbb{Z}_m$ be defined as $\varphi(k) = [k]_m$. To see
that $\varphi$ is onto, take $[l]_m \in \mathbb{Z}_m$ for some $l \in \mathbb{Z}$. We then find that $\varphi(l) = [l]_m$.
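
On finite data, a function is onto exactly when its set of values equals the codomain. The Python sketch below uses finite windows of $\mathbb{Z}$ as stand-ins for the infinite domains of Examples 4.5.13 and 4.5.15. The helper `is_surjective`, the modulus 7, and the window sizes are our own assumptions, so this only illustrates the definition.

```python
def is_surjective(phi, codomain):
    """A finite function (dict) is onto when every element of the codomain is a value."""
    return set(phi.values()) == set(codomain)


m = 7
phi = {k: k % m for k in range(-10, 11)}   # finite piece of k -> [k]_m, classes as residues
print(is_surjective(phi, range(m)))        # True

h = {n: 2 * n for n in range(-5, 6)}       # finite piece of h(n) = 2n
window = range(-10, 11)
print(is_surjective(h, window))            # False
print(sorted(set(window) - set(h.values())))  # the odd integers in the window have no pre-image
```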


■ EXAMPLE 4.5.16


Let $m, n \in \mathbb{N}$ with $m > n$. A function $\pi : \mathbb{R}^m \to \mathbb{R}^n$ defined by


$$\pi(x_1, x_2, \dots, x_n, x_{n+1}, \dots, x_m) = (x_1, x_2, \dots, x_n)$$


is called a projection. Such functions are not one-to-one, but they are onto. For
instance, define $\pi : \mathbb{R} \times \mathbb{R} \times \mathbb{R} \to \mathbb{R} \times \mathbb{R}$ by


$$\pi(x, y, z) = (x, y).$$


It is not one-to-one because $\pi(1, 2, 3) = \pi(1, 2, 4) = (1, 2)$. However, if we take
$(a, b) \in \mathbb{R} \times \mathbb{R}$, then $\pi(a, b, 0) = (a, b)$, so $\pi$ is onto.


If a function is not onto, it has a diagram like that of Figure 4.16. Therefore, to show 
that a function is not a surjection, we must find an element of the codomain that does 
not have a pre-image. 


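■ EXAMPLE 4.5.17


Define $f : \mathbb{Z} \to \mathbb{Z}$ by $f(n) = 3n$. This function is not onto because 5 does not
have a pre-image in $\mathbb{Z}$.


■ EXAMPLE 4.5.18


The function
$$\varphi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z} \times \mathbb{Z} \times \mathbb{Z}$$


defined by $\varphi(a, b) = (a, b, 0)$ is not onto because $(1, 1, 1)$ does not have a
pre-image in $\mathbb{Z} \times \mathbb{Z}$.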


We have the following analog of Theorem 4.5.10 for surjections. 


■ THEOREM 4.5.19


If $\varphi : A \to B$ and $\psi : B \to C$ are surjections, $\psi \circ \varphi$ is a surjection.


PROOF
Assume that $\varphi : A \to B$ and $\psi : B \to C$ are surjections. Take $c \in C$. Then,
there exists $b \in B$ so that $\psi(b) = c$ and $a \in A$ such that $\varphi(a) = b$. Therefore,


$$(\psi \circ \varphi)(a) = \psi(\varphi(a)) = \psi(b) = c.$$ ∎


Figure 4.16 $\varphi$ is not an onto function.




Bijections 


If we have a function that is both one-to-one and onto, then it is called a bijection or a
one-to-one correspondence. Observe that $\varnothing$ is a bijection.


■ EXAMPLE 4.5.20


As illustrated in Examples 4.5.6 and 4.5.14, every linear $f : \mathbb{R} \to \mathbb{R}$ with nonzero
slope is a bijection.


■ EXAMPLE 4.5.21


Both $g : (-\pi/2, \pi/2) \to \mathbb{R}$ such that $g(\theta) = \tan\theta$ and $h : \mathbb{R} \to (0, \infty)$ where
$h(x) = e^x$ are bijections.


We are now ready to give the standard test for invertibility. Its proof requires both 
Lemma 4.5.2 and Theorem 4.5.3. The benefit of this theorem is that it provides a test 
for invertibility in which the given function is examined instead of its inverse. 


■ THEOREM 4.5.22


A function is invertible if and only if it is a bijection.


PROOF
Let $\varphi : A \to B$ be a function.


• Suppose that $\varphi$ is invertible. To show that $\varphi$ is one-to-one, let $x_1, x_2 \in A$
and assume that $\varphi(x_1) = \varphi(x_2)$. Then, by Theorem 4.5.3,


$$x_1 = \varphi^{-1}(\varphi(x_1)) = \varphi^{-1}(\varphi(x_2)) = x_2.$$


To see that $\varphi$ is onto, take $y \in B$. Then, there exists $x \in A$ such that
$\varphi^{-1}(y) = x$. Hence, $\varphi(x) = y$ by Lemma 4.5.2.


• Assume that $\varphi$ is both one-to-one and onto. To show that $\varphi^{-1}$ is a function,
let $(y, x), (y, x') \in \varphi^{-1}$. This implies that $\varphi(x) = y = \varphi(x')$. Since $\varphi$ is
one-to-one, $x = x'$. To prove that the domain of $\varphi^{-1}$ is $B$, take $y \in B$.
Since $\varphi$ is onto, there exists $x \in A$ such that $\varphi(x) = y$. By Lemma 4.5.2,
we have that $\varphi^{-1}(y) = x$, so $y \in \operatorname{dom}(\varphi^{-1})$. ∎
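
Theorem 4.5.22 can be confirmed exhaustively for very small sets. The Python sketch below enumerates every function from a three-element set into sets with two, three, and four elements and compares the two conditions. The helper names and the sample sets are our own, and an exhaustive check on small sets is an illustration, not a substitute for the proof.

```python
from itertools import product


def is_bijection(phi, codomain):
    values = list(phi.values())
    one_to_one = len(values) == len(set(values))
    onto = set(values) == set(codomain)
    return one_to_one and onto


def inverse_is_function_on(phi, codomain):
    """Return True when the inverse relation of phi is a function whose domain is `codomain`."""
    inverse = {}
    for x, y in phi.items():
        if y in inverse:          # y would be paired with two different elements
            return False
        inverse[y] = x
    return set(inverse) == set(codomain)


A = [0, 1, 2]
agree = True
for B in (["x", "y"], ["x", "y", "z"], ["w", "x", "y", "z"]):
    for images in product(B, repeat=len(A)):
        phi = dict(zip(A, images))            # one function A -> B
        if is_bijection(phi, B) != inverse_is_function_on(phi, B):
            agree = False

print(agree)  # True: the two conditions coincide for every function examined
```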


By Theorem 4.5.22 the functions of Examples 4.5.20 and 4.5.21 are invertible. 


■ THEOREM 4.5.23


If $\varphi : A \to B$ and $\psi : B \to C$ are bijections, then $\varphi^{-1}$ and $\psi \circ \varphi$ are bijections.


PROOF
Suppose $\varphi$ is a bijection. By Theorem 4.5.22, it is invertible, so $\varphi^{-1}$ is a function
that has $\varphi$ as its inverse. Therefore, $\varphi^{-1}$ is a bijection by Theorem 4.5.22.
Combining the proofs of Theorems 4.5.10 and 4.5.19 shows that $\psi \circ \varphi$ is a bijection
when $\psi$ is a bijection. ∎


Using the functions $g$ and $h$ from Example 4.5.21, we conclude from Theorem 4.5.23
that $h \circ g$ is a bijection with domain $(-\pi/2, \pi/2)$ and range $(0, \infty)$.

Order Isomorphisms


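Consider the function
$$\varphi : \mathbb{Z} \times \{0\} \to \{0\} \times \mathbb{Z} \tag{4.8}$$


defined by $\varphi(m, 0) = (0, m)$. It can be shown that $\varphi$ is a bijection (compare Exercise 12).
Define the linear orders $\preccurlyeq$ on $\mathbb{Z} \times \{0\}$ and $\preccurlyeq'$ on $\{0\} \times \mathbb{Z}$ by


$$(m, 0) \preccurlyeq (n, 0) \text{ if and only if } m \le n$$
and
$$(0, m) \preccurlyeq' (0, n) \text{ if and only if } m \le n.$$


Notice that $(3, 0) \preccurlyeq (5, 0)$ and $(0, 3) \preccurlyeq' (0, 5)$ because $3 \le 5$. We can generalize this to
conclude that


$$(m, 0) \preccurlyeq (n, 0) \text{ if and only if } (0, m) \preccurlyeq' (0, n),$$
and this implies that
$$(m, 0) \preccurlyeq (n, 0) \text{ if and only if } \varphi(m, 0) \preccurlyeq' \varphi(n, 0).$$
This leads to the next definition.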


■ DEFINITION 4.5.24


Let $R$ be a relation on $A$ and $S$ be a relation on $B$.


• $\varphi : A \to B$ is an order-preserving function if for all $a_1, a_2 \in A$,
$$(a_1, a_2) \in R \text{ if and only if } (\varphi(a_1), \varphi(a_2)) \in S$$


and we say that $\varphi$ preserves $R$ with $S$.
• An order-preserving bijection is an order isomorphism.


• $(A, R)$ is order isomorphic to $(B, S)$ and we write $(A, R) \cong (B, S)$ if
there exists an order isomorphism $\varphi : A \to B$ preserving $R$ with $S$. If
$(A, R) \cong (B, S)$ and the relations $R$ and $S$ are clear from context, we can
write $A \cong B$. Sometimes $(A, R)$ and $(B, S)$ are said to have the same order
type when they are order isomorphic.


An isomorphism pairs elements from two sets in such a way that the orders on the two 
sets appear to be the same. 
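
The order-preserving condition of Definition 4.5.24 can be tested directly on finite relations. The Python sketch below uses a finite window of the map (4.8). The helper `preserves`, the window $-3, \dots, 3$, and the map `flipped` (sending $(m, 0)$ to $(0, -m)$, a hypothetical example for contrast) are our own assumptions.

```python
def preserves(phi, R, S):
    """True when (a1, a2) in R  <=>  (phi(a1), phi(a2)) in S for all a1, a2 in dom(phi)."""
    return all(((a1, a2) in R) == ((phi[a1], phi[a2]) in S)
               for a1 in phi for a2 in phi)


window = range(-3, 4)
A = [(m, 0) for m in window]
B = [(0, m) for m in window]

# The two orders, written out as sets of ordered pairs.
R = {(p, q) for p in A for q in A if p[0] <= q[0]}
S = {(p, q) for p in B for q in B if p[1] <= q[1]}

phi = {(m, 0): (0, m) for m in window}
is_bijection = sorted(phi.values()) == sorted(B)
print(is_bijection and preserves(phi, R, S))  # True

flipped = {(m, 0): (0, -m) for m in window}   # a bijection that reverses the order
print(preserves(flipped, R, S))               # False
```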




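■ EXAMPLE 4.5.25


Define $f : \mathbb{R}^+ \to \mathbb{R}^-$ by $f(x) = -x$ (Example 3.1.12). Clearly, $f$ is a bijection.
Moreover, $f$ preserves $\le$ with $\ge$. To prove this, let $x_1, x_2 \in \mathbb{R}^+$. Then,


$$x_1 \le x_2 \Leftrightarrow -x_1 \ge -x_2 \Leftrightarrow f(x_1) \ge f(x_2).$$
Therefore, $(\mathbb{R}^+, \le) \cong (\mathbb{R}^-, \ge)$.
Observe that the inverse of (4.8) is the function
$$\varphi^{-1} : \{0\} \times \mathbb{Z} \to \mathbb{Z} \times \{0\}$$


such that $\varphi^{-1}(0, m) = (m, 0)$. This function preserves $\preccurlyeq'$ with $\preccurlyeq$. This result is
generalized and proved in the next theorem.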


■ THEOREM 4.5.26


The inverse of an order isomorphism preserving $R$ with $S$ is an order isomorphism
preserving $S$ with $R$.


PROOF

Let $\varphi : A \to B$ be an order isomorphism preserving $R$ with $S$. By Theorem 4.5.23,
$\varphi^{-1}$ is a bijection. Suppose that $(b_1, b_2) \in S$. Since $\varphi$ is onto,
there exist $a_1, a_2 \in A$ such that $\varphi(a_1) = b_1$ and $\varphi(a_2) = b_2$. This implies
that $(\varphi(a_1), \varphi(a_2)) \in S$. Since $\varphi$ is an isomorphism preserving $R$ with $S$,
we have that $(a_1, a_2) \in R$. However, $a_1 = \varphi^{-1}(b_1)$ and $a_2 = \varphi^{-1}(b_2)$, so
$(\varphi^{-1}(b_1), \varphi^{-1}(b_2)) \in R$. Each of these steps is reversible, so the converse also
holds. Therefore, $\varphi^{-1} : B \to A$ is an isomorphism preserving $S$ with $R$. ∎


Also, observe that $g : \mathbb{R} \to (0, \infty)$ defined by $g(x) = e^x$ is an order isomorphism
preserving $\le$ with $\le$. Using $f$ from Example 4.5.25, the composition $f \circ g$ is an order
isomorphism $\mathbb{R} \to (-\infty, 0)$ such that $(f \circ g)(x) = -e^x$. That this happens in general is
the next theorem. Its proof is left to Exercise 25.


■ THEOREM 4.5.27
If $\varphi : A \to B$ is an isomorphism preserving $R$ with $S$ and $\psi : B \to C$ is an
isomorphism preserving $S$ with $T$, then $\psi \circ \varphi : A \to C$ is an isomorphism
preserving $R$ with $T$.

■ EXAMPLE 4.5.28


Let $R$ be a relation on $A$, $S$ be a relation on $B$, and $T$ be a relation on $C$.
• Since the identity map is an order isomorphism, $(A, R) \cong (A, R)$.


• Suppose $(A, R) \cong (B, S)$. This means that there exists an order isomorphism
$\varphi : A \to B$ preserving $R$ with $S$. By Theorem 4.5.26, $\varphi^{-1}$ is an
order isomorphism preserving $S$ with $R$. Therefore, $(B, S) \cong (A, R)$.


• Let $(A, R) \cong (B, S)$ and $(B, S) \cong (C, T)$. By Theorem 4.5.27, we conclude
that $(A, R) \cong (C, T)$.


If an order-preserving function is one-to-one, even if it is not a surjection, the function
still provides an order isomorphism between its domain and range. This concept is
named by the next definition.


■ DEFINITION 4.5.29


$\varphi : A \to B$ is an embedding if $\varphi$ is an order isomorphism $A \to \operatorname{ran}(\varphi)$.


For example, $f : \mathbb{Z} \to \mathbb{Q}$ such that $f(n) = n$ is an embedding preserving $\le$ and
$\psi : \mathbb{R}^2 \to \mathbb{R}^3$ such that $\psi(x, y) = (x, y, 0)$ is an embedding preserving the
lexicographical order (Exercise 4.3.16). Although $\mathbb{R}^2$ is not a subset of $\mathbb{R}^3$, we view the image of
$\psi$ as a copy of $\mathbb{R}^2$ in $\mathbb{R}^3$ that preserves the orders.


Exercises 


1. Show that the given pairs of functions are inverses.


(a) $f(x) = 3x + 2$ and $g(x) = \frac{1}{3}x - \frac{2}{3}$


(b) $\varphi(a, b) = (2a, b + 2)$ and $\psi(a, b) = (\frac{1}{2}a, b - 2)$
(c) $f(x) = a^x$ and $g(x) = \log_a x$, where $a > 0$


2. For each function, graph the indicated restriction. 
(a) f 1,00), f(x) = x? 
(b) $g \upharpoonright [-5, -2]$, $g(x) = |x|$
(c) $h \upharpoonright [0, \pi/2]$, $h(x) = \cos x$


3. Prove that the given functions are one-to-one. 
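(a) $f : \mathbb{R} \to \mathbb{R}$, $f(x) = 2x + 1$
(b) $g : \mathbb{R}^2 \to \mathbb{R}^2$, $g(x, y) = (3y, 2x)$
(c) $h : \mathbb{R} \setminus \{0\} \to \mathbb{R} \setminus \{0\}$, $h(x) = 1/x$
(d) $\varphi : \mathbb{Z} \times \mathbb{R} \to \mathbb{Z} \times (0, \infty)$, $\varphi(n, x) = (3n, e^x)$
(e) $\psi : \mathrm{P}(A) \to \mathrm{P}(B)$, $\psi(C) = C \cup \{b\}$, where $A \subseteq B$ and $b \in B \setminus A$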


4. Let $f : (a, b) \to (c, d)$ be defined by

$f(x) = \frac{d - c}{b - a}(x - a) + c.$

Graph $f$ and show that it is a bijection.


5. Let $f$ and $g$ be functions such that $\operatorname{ran}(g) \subseteq \operatorname{dom}(f)$.
(a) Prove that if f o g is one-to-one, g is one-to-one. 
(b) Give an example of functions f and g such that f o g is one-to-one, but f is 
not one-to-one. 


6. Define g : Z > Z,, by g(k) = [K],,. Show that @ is not one-to-one. 




7. Show that the given functions are not one-to-one. 
(a) f: ROR, f(x) =x*+4+3 
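(b) $g : \mathbb{R} \to \mathbb{R}$, $g(x) = |x - 2| + 4$
(c) $\varphi : \mathrm{P}(A) \to \{\{a\}, \varnothing\}$, $\varphi(B) = B \cap \{a\}$, where $a \in A$ and $A$ has at least two
elements
(d) $\epsilon_5 : {}^{\mathbb{R}}\mathbb{R} \to \mathbb{R}$, $\epsilon_5(f) = f(5)$


8. Let $f : \mathbb{R} \to \mathbb{R}$ be periodic (Exercise 4.4.31). Prove that $f$ is not one-to-one.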


9. Show that the given functions are onto. 
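(a) $f : \mathbb{R} \to \mathbb{R}$, $f(x) = 2x + 1$
(b) $g : \mathbb{R} \to (0, \infty)$, $g(x) = e^x$
(c) $h : \mathbb{R} \setminus \{0\} \to \mathbb{R} \setminus \{0\}$, $h(x) = 1/x$
(d) $\varphi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$, $\varphi(a, b) = a + b$
(e) $\epsilon_5 : {}^{\mathbb{R}}\mathbb{R} \to \mathbb{R}$, $\epsilon_5(f) = f(5)$


10. Show that the given functions are not onto.
(a) $f : \mathbb{R} \to \mathbb{R}$, $f(x) = e^x$
(b) $g : \mathbb{R} \to \mathbb{R}$, $g(x) = |x|$
(c) $\varphi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z} \times \mathbb{Z}$, $\varphi(a, b) = (3a, b)$
(d) $\psi : \mathbb{R} \to {}^{\mathbb{R}}\mathbb{R}$, $\psi(a) = f$, where $f(x) = a$ for all $x \in \mathbb{R}$


11. Let $f$ and $g$ be functions such that $\operatorname{ran}(g) \subseteq \operatorname{dom}(f)$.
(a) Prove that if $f \circ g$ is onto, then $f$ is onto.
(b) Give an example of functions $f$ and $g$ such that $f \circ g$ is onto, but $g$ is not
onto.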


12. Define $\varphi : \mathbb{Q} \times \mathbb{Z} \to \mathbb{Z} \times \mathbb{Q}$ by $\varphi(x, y) = (y, x)$. Show that $\varphi$ is a bijection.


13. Show that the function $\gamma : A \times B \to C \times D$ defined by

$\gamma(a, b) = (\varphi(a), \psi(b))$

is a bijection if both $\varphi : A \to C$ and $\psi : B \to D$ are bijections.


14. Define $\gamma : A \times (B \times C) \to (A \times B) \times C$ by

$\gamma(a, (b, c)) = ((a, b), c).$

Prove $\gamma$ is a bijection.


15. Demonstrate that the inverse of a bijection is a bijection.


16. Prove that the empty set is a bijection with domain and range equal to $\varnothing$.


17. Let $A \subseteq \mathbb{R}$ and define $\varphi : {}^{\mathbb{R}}\mathbb{R} \to {}^{A}\mathbb{R}$ by $\varphi(f) = f \upharpoonright A$. Is $\varphi$ always one-to-one? Is
it always onto? Explain.


18. A function $f : A \to B$ has a left inverse if there exists a function $g : B \to A$ such
that $g \circ f = I_A$. Prove that a function is one-to-one if and only if it has a left inverse.


19. A function $f : A \to B$ has a right inverse if there exists a function $g : B \to A$ so
that $f \circ g = I_B$. Prove a function is onto if and only if it has a right inverse.


20. Let $(A, \preccurlyeq)$ be a poset. For every bijection $\varphi : A \to A$, $\varphi$ is increasing (Exercise 4.4.32)
if and only if $\varphi^{-1}$ is increasing.


21. Show that if a real-valued function is increasing, it is one-to-one. 


22. Prove or show false this modification of Theorem 4.5.23: If $\varphi : A \to B$ and
$\psi : C \to D$ are bijections with $\operatorname{ran}(\varphi) \subseteq C$, then $\psi \circ \varphi$ is a bijection.


23. Let $A, B \subseteq \mathbb{R}$ be two sets ordered by $<$. Let $A$ be well-ordered by $<$. Prove that if
$f : A \to B$ is an order-preserving surjection, $B$ is well-ordered by $<$.


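24. Let $\varphi : A \to B$ be an isomorphism preserving $R$ with $R'$. Let $C \subseteq A$ and $D \subseteq B$.
Prove the following.

(a) $(C, R \cap [C \times C]) \cong (\varphi[C], R' \cap [\varphi[C] \times \varphi[C]])$

(b) $(D, R' \cap [D \times D]) \cong (\varphi^{-1}[D], R \cap [\varphi^{-1}[D] \times \varphi^{-1}[D]])$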


25. Prove Theorem 4.5.27. 


26. Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = 2x + 1$. Prove that $f$ is an order isomorphism
preserving $<$ with $<$.


27. Suppose that $(A, R)$ and $(B, S)$ are posets. Let $\varphi : A \to B$ be an order isomorphism
preserving $R$ with $S$ and $C \subseteq A$. Prove that if $m$ is the least element of $C$ with
respect to $R$, then $\varphi(m)$ is the least element of $\varphi[C]$ with respect to $S$.


28. Find linear orders $(A, \preccurlyeq)$ and $(B, \preccurlyeq')$ such that each is isomorphic to a subset of
the other but $(A, \preccurlyeq)$ is not isomorphic to $(B, \preccurlyeq')$.


29. Let $(A, \preccurlyeq)$ be a poset. Prove that there exists $B \subseteq \mathrm{P}(A)$ such that $(A, \preccurlyeq) \cong (B, \subseteq)$.


4.6 IMAGES AND INVERSE IMAGES 


So far we have focused on the image of a single element in the domain of a function.
Sometimes we will need to examine a set of images.


DEFINITION 4.6.1
Let $\varphi : A \to B$ be a function and $C \subseteq A$. The image of $C$ (under $\varphi$) is
$\varphi[C] = \{\varphi(x) : x \in C\}.$


Notice that $\varphi[C] \subseteq B$ (Figure 4.17) and $\varphi[A] = \operatorname{ran}(\varphi)$.
A similar definition can be made with subsets of the codomain.


DEFINITION 4.6.2
Let $\varphi : A \to B$ be a function and $D \subseteq B$. The inverse image of $D$ (under $\varphi$) is
$\varphi^{-1}[D] = \{x \in A : \varphi(x) \in D\}.$


Figure 4.17 The image of $C$ under $\varphi$.


Figure 4.18 The inverse image of $D$ under $\varphi$.


Observe that $\varphi^{-1}[D] \subseteq A$ (Figure 4.18) and $\varphi^{-1}[\operatorname{ran}(\varphi)] = A$.


EXAMPLE 4.6.3

Let $f = \{(1, 2), (2, 4), (3, 5), (4, 5)\}$. This set is a function. Its domain is $\{1, 2, 3, 4\}$,
and its range is $\{2, 4, 5\}$. Then,

$f[\{1, 3\}] = \{2, 5\}$

and

$f^{-1}[\{5\}] = \{3, 4\}.$
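
For a finite function, the image and inverse image can be computed directly from the definitions. The sketch below is one possible way to do this in Python, assuming the function is stored as a dictionary; the helper names `image` and `inverse_image` are illustrative choices, not notation from the text.

```python
def image(f, C):
    """Image of C under f, assuming C is a subset of the domain of f."""
    return {f[x] for x in C}

def inverse_image(f, D):
    """Inverse image of D under f: every x in the domain with f(x) in D."""
    return {x for x in f if f[x] in D}

# The function {(1, 2), (2, 4), (3, 5), (4, 5)} stored as a dictionary.
f = {1: 2, 2: 4, 3: 5, 4: 5}

image_of_1_3 = image(f, {1, 3})
preimage_of_5 = inverse_image(f, {5})
```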

EXAMPLE 4.6.4

Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = x^2 + 1$.

• To prove $f[(1, 2)] = (2, 5)$, we show both inclusions. Let $y \in f[(1, 2)]$.
Then, $y = x^2 + 1$ for some $x \in (1, 2)$. By a little algebra, we see that
$2 < x^2 + 1 < 5$. Hence, $y \in (2, 5)$. Conversely, let $y \in (2, 5)$. Because

$2 < y < 5 \Leftrightarrow 1 < y - 1 < 4 \Leftrightarrow 1 < \sqrt{y - 1} < 2,$

$\sqrt{y - 1} \in (1, 2)$. Furthermore,

$f(\sqrt{y - 1}) = (\sqrt{y - 1})^2 + 1 = y.$

Therefore, $y \in f[(1, 2)]$. This is illustrated in Figure 4.19(a).

Figure 4.19 The image and inverse image of a set under $f$: (a) $f[(1, 2)] = (2, 5)$; (b) $f^{-1}[(2, 5)] = (-2, -1) \cup (1, 2)$.

• Simply because $f[(1, 2)] = (2, 5)$, we cannot conclude $f^{-1}[(2, 5)]$ equals
$(1, 2)$. Instead, $f^{-1}[(2, 5)] = (-2, -1) \cup (1, 2)$, as seen in Figure 4.19(b),
because

$x \in (-2, -1) \cup (1, 2) \Leftrightarrow -2 < x < -1 \text{ or } 1 < x < 2$
$\Leftrightarrow 1 < x^2 < 4$
$\Leftrightarrow 2 < x^2 + 1 < 5$
$\Leftrightarrow f(x) \in (2, 5)$
$\Leftrightarrow x \in f^{-1}[(2, 5)].$

• To show that $f^{-1}[(-2, -1)]$ is empty, take $x \in f^{-1}[(-2, -1)]$. This
means that $-2 < f(x) < -1$, but this is impossible because $f(x)$ is positive.


Let $f = \{(1, 3), (2, 3), (3, 4), (4, 5)\}$. This is a function $\{1, 2, 3, 4\} \to \{3, 4, 5\}$.
Notice that

$f[\{1\} \cup \{2, 3\}] = \{3, 4\} = f[\{1\}] \cup f[\{2, 3\}],$

$f[\{1\} \cap \{2, 3\}] = f[\varnothing] = \varnothing \subseteq \{3\} = f[\{1\}] \cap f[\{2, 3\}],$

$f^{-1}[\{3, 4\} \cup \{5\}] = \{1, 2, 3, 4\} = f^{-1}[\{3, 4\}] \cup f^{-1}[\{5\}],$

and

$f^{-1}[\{3, 4\} \cap \{5\}] = \varnothing = f^{-1}[\{3, 4\}] \cap f^{-1}[\{5\}].$


This result concerning the interaction of images and inverse images with union and
intersection is generalized in the next theorem.


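THEOREM 4.6.5
Let $\varphi : A \to B$ be a function with $C, D \subseteq A$ and $E, F \subseteq B$.
• $\varphi[C \cup D] = \varphi[C] \cup \varphi[D]$.
• $\varphi[C \cap D] \subseteq \varphi[C] \cap \varphi[D]$.
• $\varphi^{-1}[E \cup F] = \varphi^{-1}[E] \cup \varphi^{-1}[F]$.
• $\varphi^{-1}[E \cap F] = \varphi^{-1}[E] \cap \varphi^{-1}[F]$.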


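PROOF
We prove the first and third parts, leaving the others for Exercise 7. By Exercise 3.2.8(d),

$y \in \varphi[C \cup D] \Leftrightarrow \exists x(x \in C \cup D \wedge \varphi(x) = y)$
$\Leftrightarrow \exists x(x \in C \wedge \varphi(x) = y) \vee \exists x(x \in D \wedge \varphi(x) = y)$
$\Leftrightarrow y \in \varphi[C] \vee y \in \varphi[D]$
$\Leftrightarrow y \in \varphi[C] \cup \varphi[D].$

In addition,

$x \in \varphi^{-1}[E \cup F] \Leftrightarrow \varphi(x) \in E \cup F$
$\Leftrightarrow \varphi(x) \in E \vee \varphi(x) \in F$
$\Leftrightarrow x \in \varphi^{-1}[E] \vee x \in \varphi^{-1}[F]$
$\Leftrightarrow x \in \varphi^{-1}[E] \cup \varphi^{-1}[F].$ ■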


It might seem surprising that we only have an inclusion in the second part of Theorem 4.6.5.
To see that the other inclusion is false, let $f = \{(1, 3), (2, 3)\}$. Then,

$f[\{1\} \cap \{2\}] = f[\varnothing] = \varnothing,$

but

$f[\{1\}] \cap f[\{2\}] = \{3\} \cap \{3\} = \{3\}.$

Hence, $f[\{1\}] \cap f[\{2\}] \nsubseteq f[\{1\} \cap \{2\}]$. However, if $f$ had been a bijection, the
inclusion would hold (Exercise 13).
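
The four parts of Theorem 4.6.5 can be checked exhaustively for a small function. The following Python sketch does this for the four-element function used earlier and then replays the counterexample; it assumes functions are stored as dictionaries, and the helpers `subsets`, `image`, and `inverse_image` are illustrative names introduced here.

```python
from itertools import chain, combinations

def subsets(s):
    """All subsets of a finite set s, returned as a list of sets."""
    items = list(s)
    return [set(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def image(f, C):
    return {f[x] for x in C}

def inverse_image(f, D):
    return {x for x in f if f[x] in D}

f = {1: 3, 2: 3, 3: 4, 4: 5}
dom_subsets = subsets(set(f))
cod_subsets = subsets({3, 4, 5})

# The four parts of the theorem, tested on every pair of subsets.
union_ok = all(image(f, C | D) == image(f, C) | image(f, D)
               for C in dom_subsets for D in dom_subsets)
intersection_included = all(image(f, C & D) <= image(f, C) & image(f, D)
                            for C in dom_subsets for D in dom_subsets)
inv_union_ok = all(
    inverse_image(f, E | F) == inverse_image(f, E) | inverse_image(f, F)
    for E in cod_subsets for F in cod_subsets)
inv_intersection_ok = all(
    inverse_image(f, E & F) == inverse_image(f, E) & inverse_image(f, F)
    for E in cod_subsets for F in cod_subsets)

# The counterexample to equality in the second part.
g = {1: 3, 2: 3}
left = image(g, {1} & {2})
right = image(g, {1}) & image(g, {2})
```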


EXAMPLE 4.6.6

Let $f(x) = x^2 + 1$. We check the union results of Theorem 4.6.5.

• We have already seen in Example 4.6.4 that $f[(1, 2)] = (2, 5)$. Since
$(1, 2) = (1, 1.5] \cup [1.5, 2)$, apply $f$ to both of these intervals and find that

$f[(1, 1.5]] = (2, 3.25],$
$f[[1.5, 2)] = [3.25, 5).$

Therefore, $f[(1, 2)] = f[(1, 1.5]] \cup f[[1.5, 2)]$.

• Also, from Example 4.6.4, $f^{-1}[(2, 5)] = (-2, -1) \cup (1, 2)$. We can write
$(2, 5)$ as the union of $(2, 4)$ and $(3, 5)$. Since

$f^{-1}[(2, 4)] = (-\sqrt{3}, -1) \cup (1, \sqrt{3}),$
$f^{-1}[(3, 5)] = (-2, -\sqrt{2}) \cup (\sqrt{2}, 2),$

we have that

$f^{-1}[(2, 5)] = f^{-1}[(2, 4)] \cup f^{-1}[(3, 5)].$


Theorem 4.6.5 can be modified to arbitrary unions and intersections. The proof is
left to Exercise 17.


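THEOREM 4.6.7
Let $\{A_i : i \in I\}$ be a family of sets.

• $\varphi\left[\bigcup_{i \in I} A_i\right] = \bigcup_{i \in I} \varphi[A_i]$.

• $\varphi\left[\bigcap_{i \in I} A_i\right] \subseteq \bigcap_{i \in I} \varphi[A_i]$.

• $\varphi^{-1}\left[\bigcup_{i \in I} A_i\right] = \bigcup_{i \in I} \varphi^{-1}[A_i]$.

• $\varphi^{-1}\left[\bigcap_{i \in I} A_i\right] = \bigcap_{i \in I} \varphi^{-1}[A_i]$.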


We know that the composition of a function and its inverse equals the identity map 
(Theorem 4.5.3). The last two results of the section show a similar result involving the 
image and inverse image using functions that are either one-to-one or onto. 


THEOREM 4.6.8
Let $\varphi : A \to B$ be a function. Suppose $C \subseteq A$ and $D \subseteq B$.
• If $\varphi$ is one-to-one, then $\varphi^{-1}[\varphi[C]] = C$.
• If $\varphi$ is onto, then $\varphi[\varphi^{-1}[D]] = D$.

PROOF
We prove the first part and leave the second to Exercise 8. Suppose that $\varphi$ is an
injection. We show that $\varphi^{-1}[\varphi[C]] = C$.

• Let $x \in \varphi^{-1}[\varphi[C]]$. This means that there exists $y \in \varphi[C]$ such that
$\varphi(x) = y$. Furthermore, there is a $z \in C$ so that $\varphi(z) = y$. Therefore, since
$\varphi$ is one-to-one, $x = z$, which means that $x \in C$.

• This step will work for any function. Take $x \in C$, so $\varphi(x) \in \varphi[C]$. By
definition,

$\varphi^{-1}[\varphi[C]] = \{z \in A : \varphi(z) \in \varphi[C]\}.$

Since $x$ is also an element of $A$, we conclude that $x \in \varphi^{-1}[\varphi[C]]$. ■


Because we used the one-to-one condition only to prove $\varphi^{-1}[\varphi[C]] \subseteq C$, we suspect
that this inclusion is false if the function is not one-to-one. To see this, let $f : \mathbb{R} \to \mathbb{R}$
be defined as $f(x) = x^2$. We know that $f$ is not one-to-one. Choose $C = \{1\}$. Then,

$f^{-1}[f[C]] = f^{-1}[\{1\}] = \{-1, 1\}.$

Therefore, $f^{-1}[f[C]] \nsubseteq C$.
When examining the example, we might conjecture that the function being one-to-one
is necessary for equality. That this is the case is the following theorem.
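
The contrast between a function that is one-to-one and one that is not can be seen on a finite stand-in for the real line. The Python sketch below is an illustration only: the domain `range(-3, 4)`, the helper names, and the choice of the injective function $2x + 1$ are assumptions made here for the demonstration.

```python
def image(f, C):
    return {f(x) for x in C}

def inverse_image(f, domain, D):
    return {x for x in domain if f(x) in D}

# A finite stand-in for the real line (an assumption made for this sketch).
domain = set(range(-3, 4))

def square(x):
    return x ** 2

def affine(x):
    return 2 * x + 1

# Squaring is not one-to-one, so the round trip returns more than C = {1}.
round_trip = inverse_image(square, domain, image(square, {1}))

# The affine map is one-to-one, so the round trip recovers every tested C.
recovers_C = all(
    inverse_image(affine, domain, image(affine, {a, b})) == {a, b}
    for a in domain for b in domain
)
```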


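THEOREM 4.6.9
Let $\varphi : A \to B$ be a function.
• If $\varphi^{-1}[\varphi[C]] = C$ for all $C \subseteq A$, then $\varphi$ is one-to-one.
• If $\varphi[\varphi^{-1}[D]] = D$ for all $D \subseteq B$, then $\varphi$ is onto.


PROOF
As with the previous theorem, we prove only the first part. The second part is
Exercise 8. Assume

$\varphi^{-1}[\varphi[C]] = C$ for all $C \subseteq A$.

Take $x_1, x_2 \in A$ and let $\varphi(x_1) = \varphi(x_2)$. Now,

$\{x_1\} = \varphi^{-1}[\varphi[\{x_1\}]] = \{z \in A : \varphi(z) = \varphi(x_1)\}.$

Because $\varphi(x_1) = \varphi(x_2)$, we have that $\{x_1, x_2\} \subseteq \{z \in A : \varphi(z) = \varphi(x_1)\}$, so
we must have $x_1 = x_2$. ■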




Exercises 


1. Let f : R > R be defined by f(x) = 2x + 1. Find the images and inverse images. 
(a) f[C, 3] 
(b) f[(—co, 0)] 
(c) f(-1, DI 
(d) f7'L(0,2)U(5,8)] 
2. Let g: R > R be the function g(x) = x* — 1. Find the images and inverse images. 
(a) gl{O}] 
(b) g{Z] 
(c) g7'[{0, 15}] 
(4) g '[[-9,-5] v [0, 5]] 
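3. Define $\varphi : \mathbb{R} \times \mathbb{R} \to \mathbb{Z}$ by $\varphi(a, b) = \llbracket a \rrbracket + \llbracket b \rrbracket$. Find the images and inverse images.
(a) $\varphi[\{0\} \times \mathbb{R}]$
(b) $\varphi[(0, 1) \times (0, 1)]$
(c) $\varphi^{-1}[\{2, 4\}]$
(d) $\varphi^{-1}[\mathbb{N}]$
4. Let $\psi : \mathbb{Z} \to \mathbb{Z}$ be a function and define $\gamma : \mathrm{P}(\mathbb{Z}) \to \mathrm{P}(\mathbb{Z})$ by $\gamma(C) = \psi[C]$.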
(a) Prove that y is well-defined. 
(b) Let y(n) = 2n. Find y[{1,2,3}], y[Z], y~![(1, 2, 3}], and y~![Z]. 
(c) Under what conditions is y one-to-one? 
(d) Under what conditions is y onto? 


5. For any function $\psi$, show that $\psi[\varnothing] = \varnothing$ and $\psi^{-1}[\varnothing] = \varnothing$.
6. Prove for every $B \subseteq A$, $I_A[B] = B$ and $(I_A)^{-1}[B] = B$.
7. Prove the remaining parts of Theorem 4.6.5.
8. Prove the unproven parts of Theorems 4.6.8 and 4.6.9.
9. Let $\varphi : A \to B$ be an injection and $C \subseteq A$.
(a) Prove $\varphi(x) \in \varphi[C]$ if and only if $x \in C$.
(b) Show that Exercise 9(a) is false if the function is not one-to-one.
10. If $\psi$ is a function, $A \subseteq B \subseteq \operatorname{dom}(\psi)$, and $C \subseteq D \subseteq \operatorname{ran}(\psi)$, show that both
$\psi[A] \subseteq \psi[B]$ and $\psi^{-1}[C] \subseteq \psi^{-1}[D]$.


11. Let $\varphi : A \to B$ be a function and take disjoint sets $U$ and $V$.
(a) Prove false: If $U, V \subseteq A$, then $\varphi[U] \cap \varphi[V] = \varnothing$.
(b) Prove false: If U,V ¢ B, then g![U] ng ![V] =. 
(c) What additional assumption is needed to prove both of the implications? 


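12. Assume that $\varphi$ and $\psi$ are functions such that $\operatorname{ran}(\psi) \subseteq \operatorname{dom}(\varphi)$. Let $A$ be a subset
of $\operatorname{dom}(\psi)$. Prove or show false with a counterexample: $(\varphi \circ \psi)[A] = \varphi[\psi[A]]$.


13. Let $\varphi : A \to B$ be one-to-one. Prove the following.
(a) $\varphi[A] \cap \varphi[B] \subseteq \varphi[A \cap B]$.
(b) If $C \subseteq A$ and $D \subseteq B$, then $\varphi[C] = D$ if and only if $\varphi^{-1}[D] = C$.


14. Prove that if $\varphi : A \to B$ is a bijection and $C \subseteq A$, then $\varphi[A \setminus C] = B \setminus \varphi[C]$.


15. Find a function $\varphi : A \to B$ and a set $D \subseteq B$ such that $D \nsubseteq \varphi[\varphi^{-1}[D]]$.


16. Let $\psi$ be an injection with $A \subseteq \operatorname{dom}(\psi)$ and $B \subseteq \operatorname{ran}(\psi)$. Prove that

$B \subseteq \psi[A]$ if and only if $\psi^{-1}[B] \subseteq A$.


17. Prove Theorem 4.6.7.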


CHAPTER 5 


AXIOMATIC SET THEORY 


5.1 AXIOMS 


When we began studying set theory in Chapter 3, we made several assumptions re- 
garding which things are sets. For example, we assumed that collections of numbers, 
like $\mathbb{N}$, $\mathbb{R}$, or $(0, \infty)$ are sets. We supposed that operating with given sets to form new
collections, as with union or intersection, resulted in sets. We also assumed that for- 
mulas could be used to describe certain sets. All of this seemed perfectly reasonable, 
but since all of these assumptions were made without a carefully thought-out system, 
we would be wise to pause and investigate if we have introduced any problems. 
Consider the following question. Given a formula p(x), is there a set of the form 
{x : p(x)}? Consistent with the attitude of our previous work, we might quickly answer 
in the affirmative. Mathematicians, including Cantor, also initially thought that this was 
the case. However, it was shown independently by Bertrand Russell and Ernst Zermelo 
that not every formula can be used to define a set. For example, let $p(x) := x \notin x$ and
$A = \{x : p(x)\}$ and consider whether $A$ is an element of itself. If $A \in A$, then due
to the definition of $p(x)$, $A \notin A$, and if $A \notin A$, then $A \in A$. Because of this built-in
contradiction, it is impossible for $A$ to be a set. This is known as Russell’s paradox,
and it was a serious challenge to set theory. One solution would have been to dismiss 




A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’ Leary. 
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc. 




set theory altogether. The problem was that this new subject combined with advances 
in logic appeared to promise a framework in which to study foundational questions of 
mathematics. David Hilbert famously supported set theory by remarking that “no one 
will drive us out of this paradise that Cantor has created for us,” so dismissal was not 
an option. In order to prevent contradictions such as Russell’s paradox from appearing, 
mathematicians settled on the method of Euclid as the solution, but instead of assuming 
geometric postulates, over a period of time, certain set-theoretic axioms were chosen. 
Their purpose was to define a system by which one could determine whether a given 
collection should be considered a set in such a manner that prevented any contradictions 
from arising. In this chapter we identify the axioms and then redefine N, R, and other 
collections so that we are confident in our previous assumptions regarding them being 
sets. 


Equality Axioms 


We begin with the basics. Although not officially among the set axioms, = is always 
assumed to satisfy the following rules. They are defined to replicate the standard rea- 
soning of equality that we have been using in previous chapters. 


AXIOMS 5.1.1 [Equality]

Let $x$, $y$, and $z$ be variable symbols from theory symbols $S$.
• [E1] $x = x$.
• [E2] $x = y \leftrightarrow y = x$.
• [E3] $x = y \wedge y = z \to x = z$.

Let $x_0, x_1, \ldots, x_{n-1}$ and $y_0, y_1, \ldots, y_{n-1}$ be variable symbols from $S$.

• [E4] For any $n$-ary function symbol $f$ of $S$,

$x_0 = y_0 \wedge x_1 = y_1 \wedge \cdots \wedge x_{n-1} = y_{n-1} \to f(x_0, x_1, \ldots, x_{n-1}) = f(y_0, y_1, \ldots, y_{n-1}).$

• [E5] For any $S$-formula $p(u_0, u_1, \ldots, u_{n-1})$,

$x_0 = y_0 \wedge x_1 = y_1 \wedge \cdots \wedge x_{n-1} = y_{n-1} \to p(x_0, x_1, \ldots, x_{n-1}) \leftrightarrow p(y_0, y_1, \ldots, y_{n-1}).$


Axioms E1, E2, and E3 give $=$ the behavior of an equivalence relation (Definition 4.2.4).
For example, we can use E2 to prove that for any constant symbols $c_0$ and $c_1$,

$\vdash c_0 = c_1 \leftrightarrow c_1 = c_0.$


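The proof goes as follows:

$x_0 = x_1 \leftrightarrow x_1 = x_0 \;\Rightarrow\; (x_0 = x_1)\frac{c_0}{x_0}\frac{c_1}{x_1} \leftrightarrow (x_1 = x_0)\frac{c_0}{x_0}\frac{c_1}{x_1} \;\Rightarrow\; c_0 = c_1 \leftrightarrow c_1 = c_0.$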


Axiom E4 allows a function symbol to be used in a proof as a function, and E5 allows 
equal terms to be substituted into formulas with the result being equivalent formulas. 
For example, given the NT-term $u + v$, by E4,

$x_0 = y_0 \wedge x_1 = y_1 \to x_0 + x_1 = y_0 + y_1,$

and given the NT-formula $u + v = v + u$, by E5,

$x_0 = y_0 \wedge x_1 = y_1 \to x_0 + x_1 = x_1 + x_0 \leftrightarrow y_0 + y_1 = y_1 + y_0.$

Formal proofs that require a deduction on an equality need to reference one of the
equality axioms from Axioms 5.1.1.


Existence and Uniqueness Axioms


The axioms will be ST-formulas (Example 2.1.3), where all terms represent sets. We 
begin by assuming the existence of two sets. 


AXIOM 5.1.2 [Empty Set]
$\exists x \forall y (\neg\, y \in x).$


In ST-formulas, a witness for the empty set axiom is denoted by $\{\ \}$, although it is
usually written as $\varnothing$.


AXIOM 5.1.3 [Infinity]
$\exists x(\{\ \} \in x \wedge \forall u[u \in x \to \exists y(y \in x \wedge u \in y \wedge \forall v[v \in u \to v \in y])]).$


The standard interpretation of the infinity axiom is that there exists a set that contains
$\varnothing$ and $a \cup \{a\}$ is an element of the set if $a$ is an element of the set.

In Chapter 3, we noted that the elements of a set determine the set. For example,
$\{1, 2\} = \{1, 2, 2\}$. This principle is codified by the next axiom.


AXIOM 5.1.4 [Extensionality]
$\forall x \forall y (\forall u[u \in x \leftrightarrow u \in y] \to x = y).$


Suppose $A_1$ and $A_2$ are witnesses to the empty set axiom (5.1.2). Since $x \notin A_1$ and
$x \notin A_2$ for every $x$, we conclude that

$x \in A_1 \leftrightarrow x \in A_2.$

Therefore, $A_1 = A_2$ by extensionality (Axiom 5.1.4), which means that $\varnothing$ is the witness
to the empty set axiom. This uniqueness result does not appear to extend to the infinity
axiom because both

$\{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}, \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}\}, \ldots\}$
and 
{D, {DP}. {PD {Ph} {PS {PL {SP {PH}. 
{PD {PD} {PD {P}} {PD {DP} {PD {Shh}... } 


are witnesses, provided that they are sets. 
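
The first few elements of the first witness, and the way extensionality identifies sets with the same elements, can be modeled with Python's immutable `frozenset`. This is only a finite sketch: `successor` is an illustrative name for the operation $a \mapsto a \cup \{a\}$, and no infinite set is actually built.

```python
def successor(a):
    """Return a ∪ {a} for a hereditarily finite set a."""
    return a | frozenset([a])

empty = frozenset()
one = successor(empty)      # {∅}
two = successor(one)        # {∅, {∅}}
three = successor(two)      # {∅, {∅}, {∅, {∅}}}

# Extensionality: listing an element twice does not change the set.
same_set = frozenset([1, 2]) == frozenset([1, 2, 2])

# Each set in the chain contains all of the earlier ones as elements.
contains_predecessors = all(n in three for n in (empty, one, two))
```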


Construction Axioms 


Now to build some sets. The next four axioms allow us to do this. 


AXIOM 5.1.5 [Pairing]
$\forall u \forall v \exists x \forall w (w \in x \leftrightarrow w = u \vee w = v).$

Suppose that $M$ and $N$ are sets. Since $\{M, N\}$ is a witness of

$\exists x \forall w (w \in x \leftrightarrow w = M \vee w = N),$

$\{M, N\}$ is a set by the pairing axiom, and from this, we conclude that $\{M, \{M, N\}\}$,
$\{N, \{M, N\}\}$, and $\{\{M, N\}\} = \{\{M, N\}, \{M, N\}\}$ are sets. Because $\{M, M\}$
equals $\{M\}$, pairing along with extensionality prove the existence of singletons. For
example, if $W$ is a witness to the Infinity Axiom, $\{\varnothing, W\}$, $\{\varnothing\}$, and $\{\varnothing, \{\varnothing, W\}\}$ are
sets.


AXIOM 5.1.6 [Union]
$\forall x \exists y \forall u (u \in y \leftrightarrow \exists v[v \in x \wedge u \in v]).$


By the union axiom, $\bigcup M$ is a set, and since $M \cup N = \bigcup \{M, N\}$, we conclude
that $M \cup N$ is a set. Furthermore, the empty set, union, and pairing axioms can be used
to prove that for any $n \in \mathbb{N}$, there exists a set of the form $\{a_0, a_1, \ldots, a_n\}$ (Exercise 3).


AXIOM 5.1.7 [Power Set]
$\forall x \exists y \forall u (u \in y \leftrightarrow \forall v[v \in u \to v \in x]).$

Because of Definition 3.3.1, the power set axiom can be written as

$\forall x \exists y \forall u (u \in y \leftrightarrow u \subseteq x).$

We conclude that for every set $M$, $\mathrm{P}(M)$ is a set by the power set axiom, and by
extensionality, $\mathrm{P}(M)$ is the unique set of subsets of $M$.
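
For hereditarily finite sets, the pairing, union, and power set constructions can be imitated directly. The following Python sketch is an illustration under that finiteness assumption; `pair`, `big_union`, and `power_set` are names chosen here and are not notation from the text.

```python
from itertools import combinations

def pair(u, v):
    """The set {u, v} given by pairing."""
    return frozenset([u, v])

def big_union(x):
    """The union of x: every element of an element of x."""
    return frozenset(u for v in x for u in v)

def power_set(x):
    """The set of all subsets of the finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )

M = frozenset([1, 2])
N = frozenset([2, 3])

binary_union = big_union(pair(M, N))   # M ∪ N obtained as ⋃{M, N}
singleton = pair(M, M)                 # {M, M} equals {M}
subsets_of_M = power_set(M)
```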

The next axiom is actually what is called an axiom scheme, infinitely many axioms, 
one for every formula. They are sometimes called the separation axioms. 


AXIOMS 5.1.8 [Subset]

For every ST-formula $p(u)$ not containing the symbol $y$, the following is an axiom:

$\forall x \exists y \forall u (u \in y \leftrightarrow u \in x \wedge p(u)).$


The formula $p(u)$ in the subset axioms cannot contain the symbol $y$ because the axioms
yield the existence of this set. If $y$ were among its symbols, the existence of $y$ would
depend on $y$.

The subset axioms yield many familiar sets.


• Let $\mathscr{F}$ be a set. By a subset axiom, there exists a set $C$ such that

$x \in C \leftrightarrow x \in \bigcup \mathscr{F} \wedge \forall c(c \in \mathscr{F} \to x \in c).$

Observe that the symbol $C$ does not appear in the formula

$\forall c(c \in \mathscr{F} \to x \in c).$

Also, observe that the set $C$ is the intersection of $\mathscr{F}$. Hence, $\bigcap \mathscr{F}$ is a set, and
since $M \cap N = \bigcap \{M, N\}$, we conclude that $M \cap N$ is a set.


• By a subset axiom, there exists a set $D$ such that

$x \in D \leftrightarrow x \in M \wedge x \notin N,$

so $M \setminus N$ is a set.
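
The way a subset axiom carves a new set out of an existing one can be sketched for finite sets. In the Python below, `separate` plays the role of a subset axiom with the formula passed in as a predicate; the function names and the sample sets `M` and `N` are illustrative assumptions.

```python
def separate(x, p):
    """The set {u in x : p(u)}, as a subset axiom would provide."""
    return frozenset(u for u in x if p(u))

def big_union(family):
    return frozenset(u for member in family for u in member)

def big_intersection(family):
    """Elements of the union that belong to every member of the family."""
    return separate(big_union(family),
                    lambda u: all(u in c for c in family))

M = frozenset([1, 2, 3])
N = frozenset([2, 3, 4])

intersection = big_intersection(frozenset([M, N]))   # M ∩ N
difference = separate(M, lambda u: u not in N)       # M \ N
```

Notice that the predicate only ever filters a set that is already available, which mirrors why the subset axioms cannot produce a collection such as the one in Russell's paradox.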


Replacement Axioms 


Given sets $A$ and $B$, the function $\varphi : A \to B$ is a set because $\varphi \subseteq A \times B$ and $A \times B$
is a set. Suppose, instead, that the function is defined using a formula $p(x, y)$ and that
its domain is given by a set $A$. It cannot be concluded from Axioms 5.1.2–5.1.8 that
the range $\{y : x \in A \wedge p(x, y)\}$ is a set. However, it appears reasonable that it is. For
example, define

$p(x, y) := y \in \mathbb{Z} \wedge y \le x \wedge \forall z(z \in \mathbb{Z} \wedge y < z \to x < z).$

An examination of the formula shows $p(3.4, 3)$ and $p(-7.1, -8)$. Since

$p(x, y_1) \wedge p(x, y_2) \to y_1 = y_2,$

the formula $p(x, y)$ defines a function (Definition 4.4.10). If the domain is given to be
$[0, \infty)$, its range is the set $[0, \infty) \cap \mathbb{Z}$. Generalizing to an arbitrary $p(x, y)$, it is expected
that the range would be a set if the domain is a set. The next axiom scheme guarantees
this. It was first found in correspondence between Cantor and Richard Dedekind (Cantor 1932)
and Dmitry Mirimanoff (1917) with formal versions by Abraham Fraenkel
(1922) and Thoralf Skolem (1922).
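
The formula in the example describes the greatest integer less than or equal to $x$, and the range it determines can be computed for a finite domain. The Python sketch below makes two assumptions that are not in the text: the quantifier over all integers $z$ is replaced by the equivalent test $x < y + 1$, and the candidate values of $y$ are limited to a small window.

```python
import math

def p(x, y):
    """y is an integer, y <= x, and every integer z > y satisfies x < z.

    For integers, the last clause holds exactly when x < y + 1.
    """
    return isinstance(y, int) and y <= x < y + 1

domain = [0, 0.5, 1.2, 3.4, 7.9]
candidates = range(-10, 11)   # finite window of integers (an assumption)

# The formula is functional: each x is paired with exactly one y.
functional = all(sum(p(x, y) for y in candidates) == 1 for x in domain)

# The range determined by the formula on this finite domain.
range_of_p = {y for x in domain for y in candidates if p(x, y)}

# The pairing agrees with the floor function from the standard library.
agrees_with_floor = all(p(x, math.floor(x)) for x in domain)
```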


AXIOMS 5.1.9 [Replacement]

For every ST-formula $p(t, w)$ not containing the symbol $y$, the following is an
axiom:

$\forall x[\forall u \forall v_1 \forall v_2(u \in x \wedge p(u, v_1) \wedge p(u, v_2) \to v_1 = v_2) \to \exists y \forall w(w \in y \leftrightarrow \exists t[t \in x \wedge p(t, w)])].$


As an example, every indexed family of sets is a set. To prove this, let $I$ be a set and
$A_i$ be a set for all $i \in I$. Define

$p(i, y) := y = A_i.$

Observe that by E2 and E3 (Axioms 5.1.1),

$y_1 = A_i \wedge y_2 = A_i \to y_1 = y_2.$

Therefore, by a replacement axiom (5.1.9), there exists a set $\mathscr{F}$ such that

$w \in \mathscr{F} \leftrightarrow \exists i(i \in I \wedge w = A_i),$

so $\mathscr{F} = \{A_i : i \in I\}$ is a set, which implies that the union and intersection of families
of sets are sets.

Suppose $a \in M$ and $b \in N$. By the subset and pairing axioms, we conclude that
$\{a, b\}$, $(a, b) = \{\{a\}, \{a, b\}\}$ (Definition 3.2.8), and $\{(a, b)\}$ are sets. Therefore, fixing
$b$,

$\{\{(a, b)\} : a \in M\}$

is a family of sets, which implies that it is a set. Likewise,

$\{\{\{(a, b)\} : a \in M\} : b \in N\}$

is a set. Hence, by the union axiom (5.1.6),

$M \times N = \bigcup \bigcup \{\{\{(a, b)\} : a \in M\} : b \in N\}$

is a set. This implies, using a subset axiom (5.1.8), that any binary relation $R$ such that
$\operatorname{dom}(R) \subseteq M$ and $\operatorname{ran}(R) \subseteq N$ is a set.
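
The double union that produces the Cartesian product can be traced on small finite sets. The Python sketch below encodes ordered pairs as $\{\{a\}, \{a, b\}\}$ using `frozenset`; the names `kpair` and `big_union` and the sample sets are assumptions made for the illustration.

```python
def kpair(a, b):
    """The ordered pair (a, b) encoded as {{a}, {a, b}}."""
    return frozenset([frozenset([a]), frozenset([a, b])])

def big_union(family):
    return frozenset(u for member in family for u in member)

M = frozenset([1, 2])
N = frozenset(["x", "y"])

# For each fixed b, the family {{(a, b)} : a in M}; then collect over b in N.
nested = frozenset(
    frozenset(frozenset([kpair(a, b)]) for a in M)
    for b in N
)

# Two applications of union strip the two outer layers of braces.
product = big_union(big_union(nested))
expected = frozenset(kpair(a, b) for a in M for b in N)
```

The encoding distinguishes order, so `kpair(1, 2)` and `kpair(2, 1)` are different sets, while `kpair(1, 1)` collapses to `{{1}}`.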


Axiom of Choice 


Suppose that we are given the pairwise disjoint family of sets

$\mathscr{F} = \{\{1, 3, 5\}, \{2, 9, 11\}, \{7, 8, 13\}\}.$

It is easy to find a set $S$ such that

$S \cap A$ is a singleton for every $A \in \mathscr{F}$. (5.1)

Simply run through the elements of $\mathscr{F}$ and choose an element from each set and put it
in $S$. Since $\mathscr{F}$ is pairwise disjoint, each choice will differ from the others. For example,
it might be that

$S = \{1, 9, 13\}.$


However, what happens if $\mathscr{F}$ is an infinite set? If there were no systematic way in which
elements could be chosen from the sets of $\mathscr{F}$, we would be left with making infinitely
many choices, which is something that we cannot do. Nonetheless, it appears reasonable
that there is a set $S$ that intersects each member of $\mathscr{F}$ exactly once. Such a set
cannot be proved to exist from Axioms 5.1.2 to 5.1.9, so we need another axiom. It is
called the axiom of choice. We will have to use it every time that an infinite number
of arbitrary choices need to be made.
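
For a finite family with a systematic rule available, no axiom is needed, and the selection can be carried out explicitly. The Python sketch below uses "take the least element" as the rule, which is an assumption made here; any other definite rule would serve equally well.

```python
family = [
    frozenset({1, 3, 5}),
    frozenset({2, 9, 11}),
    frozenset({7, 8, 13}),
]

# A systematic rule: choose the least element of each member of the family.
selector = {min(A) for A in family}

# The selector meets every member of the family in exactly one element.
meets_each_once = all(len(selector & A) == 1 for A in family)

# The set S = {1, 9, 13} from the discussion above works as well.
other_selector = {1, 9, 13}
other_meets_each_once = all(len(other_selector & A) == 1 for A in family)
```

The axiom of choice matters precisely when no such rule can be written down for every member of an infinite family.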


AXIOM 5.1.10 [Choice]

If $\mathscr{F}$ is a family of pairwise disjoint, nonempty sets, there exists $S \subseteq \bigcup \mathscr{F}$ such
that $S \cap A$ is a singleton for all $A \in \mathscr{F}$.


The statement of the axiom of choice can be written as an ST-formula (Exercise 12).
Also, notice that $S$ in Axiom 5.1.10 is a function (Exercise 13). It is called a selector.
The next result follows quickly from the axiom of choice. In fact, the proposition is
equivalent to the axiom (Exercise 15), so this corollary is often used as a replacement for
it.

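COROLLARY 5.1.11
For every binary relation $R$, there exists a function $\varphi$ such that $\varphi \subseteq R$ and

$\operatorname{dom}(\varphi) = \operatorname{dom}(R).$


PROOF
Let $R \subseteq A \times B$. Define $\mathscr{F} = \{\{a\} \times [a]_R : a \in A\}$, which is a set by a replacement
axiom (Exercise 14). Since $\mathscr{F}$ is pairwise disjoint and the set $\{a\} \times [a]_R \neq \varnothing$ for
all $a \in \operatorname{dom}(R)$, the axiom of choice (5.1.10) implies that there exists a selector
$S$ such that $S \subseteq \bigcup \mathscr{F}$ and $S \cap (\{a\} \times [a]_R)$ is a singleton for all $a \in \operatorname{dom}(R)$.
Thus, $S \subseteq R$, and for all $a \in \operatorname{dom}(R)$, there exists a unique $b \in B$ such that
$(a, b) \in S$. This implies that $S$ is the desired function. ■


Given a family of sets $\mathscr{F}$, define a relation $R \subseteq \mathscr{F} \times \bigcup \mathscr{F}$ by

$R = \{(A, a) : A \in \mathscr{F} \wedge a \in A\}.$


By Corollary 5.1.11, there exists a function $\xi : \mathscr{F} \to \bigcup \mathscr{F}$ such that $\xi(A) \in A$ for all
$A \in \mathscr{F}$. The function $\xi$ is called a choice function.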
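
For a finite relation, a function contained in the relation with the same domain can be extracted without any appeal to choice. The Python sketch below is one way to do this, assuming the relation is a set of pairs whose entries can be sorted; the name `choice_subfunction` and the sample relation are illustrative.

```python
def choice_subfunction(R):
    """Return a dict phi with phi ⊆ R and dom(phi) = dom(R).

    For each a in the domain, the first b in sorted order is kept.
    """
    phi = {}
    for a, b in sorted(R):
        phi.setdefault(a, b)   # keep only the first b seen for each a
    return phi

R = {(1, "a"), (1, "b"), (2, "c"), (3, "a"), (3, "c")}
phi = choice_subfunction(R)

contained_in_R = set(phi.items()) <= R
same_domain = set(phi) == {a for a, _ in R}
```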


EXAMPLE 5.1.12

Let $\mathscr{A} = \{A_i : i \in I\}$ be a family of nonempty sets. We want to define a family
of singletons $\mathscr{B}$ such that for all $i \in I$,

if $\{a_i\} \in \mathscr{B}$, then $a_i \in A_i$.

By Corollary 5.1.11, there exists a choice function $\xi : \mathscr{A} \to \bigcup \mathscr{A}$. The family
$\mathscr{B} = \{\{\xi(A_i)\} : i \in I\}$ is the desired set because $\xi(A_i) \in A_i$ for all $i \in I$.


There are many theorems equivalent to the axiom of choice (5.1.10). One such result
involves families of sets. Take $n \in \mathbb{N}$ and define

$\mathscr{A}_n = \{\{0\}, \{0, 1\}, \ldots, \{0, 1, \ldots, n\}\}.$

Observe that $\mathscr{A}_n$ is a chain with respect to $\subseteq$ and contains a maximal element
(Definition 4.3.16), which is $\{0, 1, \ldots, n\}$. However, the chain

$\mathscr{A} = \{\{0\}, \{0, 1\}, \ldots, \{0, 1, \ldots, n\}, \ldots\}$

has no maximal element. There are many sets that can be added to $\mathscr{A}$ to give it a
maximal element, but the natural choice is to add the union of $\mathscr{A}$ to the family giving

$\mathscr{A}' = \{\{0\}, \{0, 1\}, \ldots, \{0, 1, \ldots, n\}, \ldots, \mathbb{N}\}.$

$\mathscr{A}'$ has a maximal element, namely, $\mathbb{N}$. The generalization of this result to any family
of sets was first proved by Kuratowski (1922) and then independently by Zorn (1935)
for whom the theorem is named despite Kuratowski’s priority. The proof given is
essentially due to Zermelo (Halmos 1960).


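THEOREM 5.1.13 [Zorn’s Lemma]

Let $\mathscr{A}$ be a family of sets. If $\bigcup \mathscr{C} \in \mathscr{A}$ for every chain $\mathscr{C}$ of $\mathscr{A}$ with respect to
$\subseteq$, there exists $M \in \mathscr{A}$ such that $M \not\subset A$ for all $A \in \mathscr{A}$.


PROOF
Let $\xi : \mathrm{P}(\mathscr{A}) \setminus \{\varnothing\} \to \mathscr{A}$ be a choice function (Corollary 5.1.11). For every
chain $\mathscr{C}$ of $\mathscr{A}$, define

$\widehat{\mathscr{C}} = \{A \in \mathscr{A} : \mathscr{C} \cup \{A\} \text{ is a chain}\}.$

Notice that $\widehat{\mathscr{C}}$ is the set of elements of $\mathscr{A}$ that when added to $\mathscr{C}$ yields a chain.
Let $\mathrm{Ch}(\mathscr{A})$ be defined as the set of all chains of $\mathscr{A}$. That both $\widehat{\mathscr{C}}$ and $\mathrm{Ch}(\mathscr{A})$ are
sets is left to Exercise 17(a). Define

$X : \mathrm{Ch}(\mathscr{A}) \to \mathrm{Ch}(\mathscr{A})$

by

$X(\mathscr{C}) = \mathscr{C} \cup \{\xi(\widehat{\mathscr{C}} \setminus \mathscr{C})\}$ if $\widehat{\mathscr{C}} \setminus \mathscr{C} \neq \varnothing$,
$X(\mathscr{C}) = \mathscr{C}$ if $\widehat{\mathscr{C}} \setminus \mathscr{C} = \varnothing$.

Next, suppose that $\mathscr{C}_0$ is a chain of $\mathscr{A}$ such that $X(\mathscr{C}_0) = \mathscr{C}_0$. By the assumption
on $\mathscr{A}$, we have that $\bigcup \mathscr{C}_0 \in \mathscr{A}$. To prove that $\bigcup \mathscr{C}_0$ is a maximal element of $\mathscr{A}$,
take $A \in \mathscr{A}$ such that $\bigcup \mathscr{C}_0 \subset A$. Since $A$ has an element that is not in any of
the elements of $\mathscr{C}_0$, the union $\mathscr{C}_0 \cup \{A\}$ is a chain properly containing $\mathscr{C}_0$, which
is impossible. We conclude that the theorem is proved if $\mathscr{C}_0$ is shown to exist.

To accomplish this, we begin with a definition. A subset $\mathscr{T}$ of $\mathrm{Ch}(\mathscr{A})$ is called a
tower when

• $\varnothing \in \mathscr{T}$,
• $X(\mathscr{C}) \in \mathscr{T}$ for all $\mathscr{C} \in \mathscr{T}$,
• $\bigcup \mathscr{D} \in \mathscr{T}$ for every chain $\mathscr{D} \subseteq \mathscr{T}$.

Define $\mathrm{To}(\mathscr{A})$ to be the set of all towers of $\mathscr{A}$. Observe that $\mathrm{Ch}(\mathscr{A})$ and $\bigcap \mathrm{To}(\mathscr{A})$
are both towers [Exercise 17(b)].

Take $C \in \bigcap \mathrm{To}(\mathscr{A})$ such that $C$ is comparable with respect to inclusion to
every element of $\bigcap \mathrm{To}(\mathscr{A})$. Such a set $C$ exists since $\varnothing$ is an element of $\bigcap \mathrm{To}(\mathscr{A})$
and comparable to every element of $\bigcap \mathrm{To}(\mathscr{A})$. Suppose $A \in \bigcap \mathrm{To}(\mathscr{A})$ such that
$A \subset C$. Because $\bigcap \mathrm{To}(\mathscr{A})$ is a tower, $X(A) \in \bigcap \mathrm{To}(\mathscr{A})$. If $C \subset X(A)$, then
$X(A) \setminus A$ has at least two elements, which is impossible. Therefore,

if $A \in \bigcap \mathrm{To}(\mathscr{A})$ and $A \subset C$, then $X(A) \subseteq C$. (5.2)

Define

$T = \{A \in \bigcap \mathrm{To}(\mathscr{A}) : A \subseteq C \vee X(C) \subseteq A\}.$

If $A \subseteq C$, then $A \subseteq X(C)$ since $C \subseteq X(C)$. Hence, every element of $T$ is
comparable to $X(C)$. Also, $T$ is a tower because of the following:

• $\varnothing \in T$ because $\varnothing \in \bigcap \mathrm{To}(\mathscr{A})$ and $\varnothing \subseteq C$.

• Let $B \in T$. Since $\bigcap \mathrm{To}(\mathscr{A})$ is a tower, $X(B) \in \bigcap \mathrm{To}(\mathscr{A})$. If $B \subset C$,
then $X(B) \subseteq C$ by (5.2). If $C \subseteq B$, then $X(C) \subseteq X(B)$. In both cases,
$X(B) \in T$.

• Let $\mathscr{C} \subseteq T$ be a chain. Then, $\bigcup \mathscr{C} \in \bigcap \mathrm{To}(\mathscr{A})$. Suppose there exists
$C_0 \in \mathscr{C}$ such that $C_0$ is not a subset of $C$. This implies that $X(C) \subseteq C_0$.
Thus, $X(C) \subseteq \bigcup \mathscr{C}$, so $\bigcup \mathscr{C} \in T$.

Hence, $T = \bigcap \mathrm{To}(\mathscr{A})$ because $T \subseteq \bigcap \mathrm{To}(\mathscr{A})$. Therefore, $\bigcap \mathrm{To}(\mathscr{A})$ is a chain
because $C$ is arbitrary [Exercise 17(c)]. Since $\bigcap \mathrm{To}(\mathscr{A})$ is a tower,

$\bigcup \bigcap \mathrm{To}(\mathscr{A}) \in \bigcap \mathrm{To}(\mathscr{A})$

and then

$X\left(\bigcup \bigcap \mathrm{To}(\mathscr{A})\right) \in \bigcap \mathrm{To}(\mathscr{A}).$

Hence, since $\bigcup \bigcap \mathrm{To}(\mathscr{A})$ is an upper bound of $\bigcap \mathrm{To}(\mathscr{A})$,

$X\left(\bigcup \bigcap \mathrm{To}(\mathscr{A})\right) \subseteq \bigcup \bigcap \mathrm{To}(\mathscr{A}),$

which yields

$X\left(\bigcup \bigcap \mathrm{To}(\mathscr{A})\right) = \bigcup \bigcap \mathrm{To}(\mathscr{A}).$ ■


There are many equivalents to the axiom of choice (Rubin and Rubin 1985, Jech
1973). One of them is Zorn’s lemma.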


THEOREM 5.1.14

The axiom of choice is equivalent to Zorn’s lemma.


PROOF
Since we have already proved that the axiom of choice (5.1.10) implies Zorn’s
lemma (Theorem 5.1.13), we only need to prove the converse. We must be careful
to only use Axioms 5.1.2–5.1.9 and not Axiom 5.1.10.
Assume Zorn’s lemma and let $R$ be a relation. Define

$\mathscr{A} = \{\varphi : \varphi \subseteq R \wedge \varphi \text{ is a function}\}.$

Let $\mathscr{C}$ be a chain of elements of $\mathscr{A}$.

• Take $\psi \in \bigcup \mathscr{C}$, so $\psi \in C$ for some $C \in \mathscr{C}$. This implies that $C$ is a subset
of $R$. Therefore, $\psi \in R$, proving that $\bigcup \mathscr{C} \subseteq R$.

• $\bigcup \mathscr{C}$ is a function by Exercise 4.4.13.

We conclude that $\bigcup \mathscr{C} \in \mathscr{A}$. Thus, by Zorn’s lemma, there exists a maximal
element $\Phi \in \mathscr{A}$. This means that $\Phi \subseteq R$ and $\operatorname{dom}(\Phi) \subseteq \operatorname{dom}(R)$. To prove that
$\operatorname{dom}(R) \subseteq \operatorname{dom}(\Phi)$, let $(x, y) \in R$ and suppose that $x \notin \operatorname{dom}(\Phi)$. This implies
that $\Phi \cup \{(x, y)\} \subseteq R$ is a function. Hence, $\Phi \cup \{(x, y)\} \in \mathscr{A}$, contradicting the
maximality of $\Phi$. We conclude that $\Phi$ is the desired function, and the axiom of
choice follows (Exercise 15). ■


There was a controversy regarding the axiom of choice when Zermelo first proposed 
it. Despite mathematicians having previously used it implicitly, some objected to its 
nonconstructive nature. The other axioms yield distinct results, but the axiom of choice 
results in a set with elements that are not clearly identified. Over time, however, most 
objections have faded. This is because the majority of mathematicians regard it as rea- 
sonable and generally those who question the axiom of choice realize that eliminating 
it would lead to serious problems because many proofs in various fields of mathematics 
rely on the axiom. 


Axiom of Regularity 


The ideas that led to the next axiom (also known as the axiom of foundation) can be 
found in Mirimanoff (1917), while the statement of the axiom is credited to Skolem 
(1922) and John von Neumann (1923). 


AXIOM 5.1.15 [Regularity]
$\forall x(x \neq \{\ \} \to \exists y[y \in x \wedge \neg \exists u(u \in y \wedge u \in x)]).$


The main result of the regularity axiom is that it prevents sets from being elements of
themselves. Suppose there exists a set $A$ such that $A \in A$. Then, $A \cap \{A\} \neq \varnothing$, but the
regularity axiom applied to $\{A\}$, whose only element is $A$, implies that $A \cap \{A\}$ should be empty.


THEOREM 5.1.16

No set is an element of itself.


If $V = \{x : x \text{ is a set}\}$ is a set, $V \in V$, contradicting Theorem 5.1.16. Thus, the
theorem’s corollary quickly follows.


COROLLARY 5.1.17

There is no set of all sets.


Because of the regularity axiom (5.1.15), $A \notin A$ for all sets $A$. Therefore, $\{x : x \notin x\}$
is not a set, which prevents Russell’s paradox from being deduced from the axioms,
provided that they are consistent (Theorem 1.5.2).

The axiom of regularity is the final axiom of our chosen collection of axioms. It
is believed that they do not prove any contradictions, which implies that the axioms
prevent the construction of $\{x : x \notin x\}$ as a set. Therefore, we write the following
definition. The collections of axioms are named after the mathematician who was
primarily responsible for their selection (Zermelo 1908).


DEFINITION 5.1.18

• Axioms 5.1.2–5.1.8 are the Zermelo axioms. This collection of sentences is
denoted by Z.

• Axioms 5.1.2–5.1.8 combined with replacement and regularity (Axioms 5.1.9
and 5.1.15) are the Zermelo–Fraenkel axioms, denoted by ZF.

• The Zermelo–Fraenkel axioms together with the axiom of choice are denoted by ZFC.


The nonempty sets that follow from ZFC have the property that all of their elements 
are sets. Sets with this property are called hereditary or pure. Assuming ZFC does 
not prevent us from working with different types of sets, such as sets of symbols or 
formulas, but we must remember that such nonhereditary sets are not products of ZFC, 
so they must be handled with care because we do not want to fall into a paradox. 


Exercises 


1. Let $S$ be a set of theory symbols. Let $c_1, c_2, c_3, c_4 \in S$ be constant symbols and
$f \in S$ be a binary function symbol. Suppose that $p(x, y)$ is an $S$-formula. Use the
Equality Axioms (5.1.1) to prove the following.

(a) $\vdash c_1 = c_1$

(b) $\vdash c_1 = c_2 \wedge c_2 = c_3 \to c_1 = c_3$

(c) $\vdash c_1 = c_2 \wedge c_3 = c_4 \to f(c_1, c_3) = f(c_2, c_4)$

(d) Fey = cg Ac3 = cy > [pf (C1), 3) @ PCF (C2), €4)] 
2. Prove VxVy(x =y > Wulue x ouUue yl). 
3. For anyn € N withn > 0, prove that there exists a set of the form {dp, a,,...,d,_1}. 


4. Let ℱ be a family of sets. Prove that P(⋃⋂ℱ) is a set.


236 Chapter 5 AXIOMATIC SET THEORY 


5. Use a subset axiom (5.1.8) to prove that there is no set of all sets. This proves that
Russell’s paradox does not follow from the axioms of Z.


6. Prove that there is no set that has every singleton as an element. 


7. Let A and B be sets. Define the symmetric difference of A and B by
A △ B = A \ B ∪ B \ A.


Prove that A △ B is a set.


8. Prove the given equations for all sets A, B, and C.
(a) A △ ∅ = A
(b) A △ U = Ā
(c) A △ B = B △ A
(d) A △ (B △ C) = (A △ B) △ C
(e) A △ B = (A ∪ B) \ (B ∩ A)
(f) (A △ B) ∩ C = (A ∩ C) △ (B ∩ C)
9. Given sets I and Aᵢ for all i ∈ I, show that ⋃_{i∈I} Aᵢ and ⋂_{i∈I} Aᵢ are sets.


10. Prove that A₀ × A₁ × ··· × Aₙ₋₁ is a set if A₀, A₁, ..., Aₙ₋₁ are sets.


11. Show that the Cartesian product of a nonempty family of nonempty sets is not 
empty. 


12. Write the Axiom of Choice (5.1.10) as an ST-formula. 

13. Demonstrate that a selector is a function. 

14. In the proof of Corollary 5.1.11, prove that {{a} × [a]_R : a ∈ A} is a set.
15. Prove that Corollary 5.1.11 implies Axiom 5.1.10. 


16. Let R= Rx {0,1}. Find a function F : R > {0,1} such that F(a) = 0 for all 
aeER. 


17. Prove the following parts of the proof of Zorn’s lemma (5.1.13). 
(a) @ and Ch(o) are sets. 
(b) Ch(@) and NTo(a) are towers. 
(c) [) To(#) is a chain. 


18. Prove that ZF and Zorn’s lemma imply the axiom of choice. 


19. Use Zorn’s lemma to prove that for every function g : A → B, there exists a
maximal C ⊆ A such that g ↾ C is one-to-one.


20. Prove that if 𝒜 is a collection of sets such that ⋃𝒞 ∈ 𝒜 for every chain 𝒞 ⊆ 𝒜,
then 𝒜 contains a maximal element.


21. Let (A, R) be a partial order. Prove that there exists an order S on A such that
R ⊆ S and S is linear.




22. Use the regularity axiom (5.1.15) to prove that if {a, {a, b}} = {c, {c, d}}, then
a = c and b = d. This gives an alternative to Kuratowski’s definition of an ordered
pair (Definition 3.2.8).


23. Prove that there does not exist an infinite sequence of sets A₀, A₁, A₂, ... such that
Aᵢ₊₁ ∈ Aᵢ for all i = 0, 1, 2, ....


24. Prove that the empty set axiom (5.1.2) can be proved from the other axioms of 
ZFC. 


25. Use Axiom 5.1.9 to prove that the subset axioms (5.1.8) can be proved from the 
other axioms of ZFC. 


26. Let (A, R) be a poset. Prove that every chain of A is contained in a maximal chain
with respect to ⊆. This is called the Hausdorff maximal principle.


27. Prove that the Hausdorff maximal principle implies Zorn’s lemma, so is equivalent 
to the axiom of choice. 


5.2 NATURAL NUMBERS 


In order to study mathematics itself, as opposed to studying the contents of mathemat- 
ics as when we study calculus or Euclidean geometry, we need to first develop a system 
in which all of mathematics can be interpreted. In such a system, we should be able to 
precisely define mathematical concepts, like functions and relations; construct exam- 
ples of them; and write statements about them using a very precise language. ZFC with 
first-order logic seems like a natural choice for such an endeavor. However, in order 
to be a success, this system must have the ability to represent the most basic objects 
of mathematical study. Namely, it must be able to model numbers. Therefore, within 
ZFC families of sets that copy the properties of N, Z, Q, R, and C will be constructed. 
Since we can do this, we conclude that N, Z, Q, R, and C are themselves sets, and we
also conclude that what we discover about their analogs is true about them. We begin
with the natural numbers.


■ DEFINITION 5.2.1


For every set a, the successor of a is the set a⁺ = a ∪ {a}. If a is the successor
of b, then b is the predecessor of a and we write b = a⁻.


For example, we have that {3, 5, 7}⁺ = {3, 5, 7, {3, 5, 7}} and ∅⁺ = {∅}. For
convenience, write a⁺⁺ for (a⁺)⁺, so ∅⁺⁺ = {∅, {∅}}. Also, the predecessor of the set
{∅, {∅}, {∅, {∅}}} is {∅, {∅}}, so write {∅, {∅}, {∅, {∅}}}⁻ = {∅, {∅}}.
Furthermore, since ∅ contains no elements, we have the following.


■ THEOREM 5.2.2


∅ does not have a predecessor.


238 Chapter 5 AXIOMATIC SET THEORY 


Although every set has a successor, we are primarily concerned with certain sets 
that have a particular property. 


■ DEFINITION 5.2.3

The set A is inductive if ∅ ∈ A and a⁺ ∈ A for all a ∈ A.

Definition 5.2.3 implies that if A is inductive, A contains the sets

∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, ...

because

∅⁺ = {∅},
{∅}⁺ = {∅, {∅}},
{∅, {∅}}⁺ = {∅, {∅}, {∅, {∅}}},
{∅, {∅}, {∅, {∅}}}⁺ = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}},
⋮


Of course, Definition 5.2.3 does not guarantee the existence of an inductive set. The
infinity axiom (5.1.3) does that. Then, by a subset axiom (5.1.8),


∀x∃y∀u(u ∈ y ↔ u ∈ x ∧ x is inductive ∧ ∀w[w is inductive → u ∈ w]),

where w is inductive can be written as

∅ ∈ w ∧ ∀a(a ∈ w → a⁺ ∈ w).


Therefore, the collection that contains the elements that are common to all inductive
sets is a set, and we can make the next definition (von Neumann 1923).


■ DEFINITION 5.2.4


An element that is a member of every inductive set is called a natural number.
Let ω denote the set of natural numbers. That is,


ω = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, ...}.


Definition 5.2.4 suggests that the elements of ω will be interpreted to represent the
elements of N, so represent each natural number with the appropriate element of N:


0 = ∅,
1 = {∅},
2 = {∅, {∅}},
3 = {∅, {∅}, {∅, {∅}}},
4 = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}},
⋮




Although the choice of which sets should represent the numbers of N is arbitrary, our
choice does have some fortunate properties. For example, the number of elements in
each natural number and the number of N that it represents are the same. Moreover,
notice that each natural number can be written as


0 = { },
1 = {0},
2 = {0, 1},
3 = {0, 1, 2},
4 = {0, 1, 2, 3},
⋮


and also note that


⋃1 = 0,
⋃2 = 1,
⋃3 = 2,
⋃4 = 3,
⋮

The empty set is an element of ω by definition, so suppose n ∈ ω and take A to be
an inductive set. Since n is a natural number, n ∈ A, and since A is inductive, n⁺ ∈ A.
Because A is arbitrary, we conclude that n⁺ is an element of every inductive set, so
n⁺ ∈ ω (Definition 5.2.4). This proves the next theorem.


■ THEOREM 5.2.5

ω is inductive.

The proof of the next theorem is left to Exercise 5.


■ THEOREM 5.2.6


If A is an inductive set and A ⊆ ω, then A = ω.


Order 


Because we want to interpret ω so that it represents N, we should be able to order ω as
N is ordered by ≤. That is, we need to find a partial order on ω that is also a well-order.
To do this, we start with a definition.


■ DEFINITION 5.2.7


A set A is transitive means for all a and b, if b ∈ a and a ∈ A, then b ∈ A.


Observe that ∅ is transitive because a ∈ ∅ is always false. More transitive sets are
found in the next theorem.




■ THEOREM 5.2.8


• Every natural number is transitive.


• ω is transitive.


PROOF
The proof of the second part is left to Exercise 4. We prove the first by defining


A = {n ∈ ω : n is transitive}.


We have already noted that 0 ∈ A, so assume that n ∈ A and let b ∈ a and
a ∈ n ∪ {n}. If a ∈ {n}, then a = n, so b ∈ n. If a ∈ n, then by hypothesis, b ∈ n. In
either case, b ∈ n⁺ since n ⊆ n⁺, so n⁺ ∈ A. Hence, A is inductive, so A = ω
by Theorem 5.2.6. ∎


In addition to being transitive, each element of ω has another useful property that
provides another reason why their choice was a wise one.


■ LEMMA 5.2.9

If m, n ∈ ω, then m ⊂ n if and only if m ∈ n.


PROOF
Take m, n ∈ ω. If m ∈ n, then m ⊂ n since n is transitive (Exercise 3). To prove
the converse, define


A = {k ∈ ω : ∀l(l ∈ ω ∧ l ⊂ k → l ∈ k)}.


We show that A is inductive. Trivially, 0 ∈ A. Now take n ∈ A and let m ∈ ω
such that m ⊂ n⁺. We have two cases to consider.


• Suppose n ∉ m. Then, m ⊆ n. If m ⊂ n, then m ∈ n by hypothesis, which
implies that m ∈ n⁺. If m = n, then m ∈ n⁺.


• Next assume that n ∈ m. Take x ∈ n. Since m is transitive by Theorem 5.2.8,
we have that x ∈ m. This implies that n ∪ {n} ⊆ m, but this is
impossible since m ⊂ n⁺, so n ∉ m. ∎


For example, we see that 2 ∈ 3 and 2 ⊂ 3 because

{∅, {∅}} ∈ {∅, {∅}, {∅, {∅}}}

and

{∅, {∅}} ⊂ {∅, {∅}, {∅, {∅}}}.


Now use Lemma 5.2.9 to define an order on ω. We write this order as ≤ to copy
the order on N.




■ DEFINITION 5.2.10

For all m, n ∈ ω, let

m ≤ n if and only if m ⊆ n.

Define m < n to mean m ≤ n but m ≠ n.


Lemma 5.2.9 in conjunction with Definition 5.2.10 implies that for all m, n ∈ ω, we
have that


m < n if and only if m ⊂ n if and only if m ∈ n.
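
The chain of equivalences above can be spot-checked on small cases. The following is a minimal Python sketch, assuming frozensets as a model of the natural numbers; the names numeral, is_transitive, and order_facts are hypothetical helpers, not notation from the text.

```python
# Spot-check: for small natural numbers, strict order, proper inclusion,
# and membership coincide. Helper names are assumptions of this sketch.

EMPTY = frozenset()


def successor(a):
    """Return a+ = a ∪ {a}."""
    return a | frozenset({a})


def numeral(n):
    """Build the frozenset representing n."""
    s = EMPTY
    for _ in range(n):
        s = successor(s)
    return s


def is_transitive(a):
    """A set is transitive if every element of an element is an element."""
    return all(b in a for x in a for b in x)


def order_facts(limit):
    """Check the equivalences for all pairs of numerals below limit."""
    nums = [numeral(k) for k in range(limit)]
    for i, m in enumerate(nums):
        assert is_transitive(m)
        for j, n in enumerate(nums):
            proper_subset = m < n  # '<' on frozensets is proper inclusion
            member = m in n
            assert proper_subset == member == (i < j)
            assert (m <= n) == (i <= j)  # '<=' on frozensets is inclusion
    return True


assert order_facts(6)
```

Python's comparison operators on frozensets already mean inclusion, which is why the definition of ≤ by ⊆ can be written directly as `m <= n` here.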


The order of Definition 5.2.10 makes ω a chain with ∅ as its least element.


■ THEOREM 5.2.11


(ω, ≤) is a linear order.


PROOF
That (ω, ≤) is a poset follows as in Example 4.3.9. To show that ω is a chain
under ≤, define


A = {k ∈ ω : ∀l(l ∈ ω → [k ≤ l ∨ l ≤ k])}.

We prove that A is inductive.

• Since ∅ is a subset of every set, 0 ∈ A.


• Suppose that n ∈ A and let m ∈ ω. We have two cases to check. First,
assume that n ≤ m. If n < m, then n⁺ ≤ m, while if n = m, then m < n⁺.
Now suppose that m ≤ n, but this implies that m < n⁺. In either case,
n⁺ ∈ A. ∎


We can now quickly prove the following. 


■ COROLLARY 5.2.12


For all m, n ∈ ω, if m⁺ = n⁺, then m = n.


PROOF
Let m and n be natural numbers and assume that


m ∪ {m} = n ∪ {n}.    (5.3)


Take x ∈ m. Then, x ∈ n or x = n. If x ∈ n, then m ⊆ n, so suppose that
x = n. This implies that n ∈ m, so m ≠ n by Theorem 5.1.16. However, we then
have m ∈ n by (5.3), which contradicts the trichotomy law (Theorem 4.3.21).
Similarly, we can prove that n ⊆ m. ∎


Since the successor defines a function S : ω → ω where S(n) = n⁺, Corollary 5.2.12
implies that S is one-to-one.

Theorem 5.2.11 shows that (ω, ≤) is a linear order as (N, ≤) is a linear order. The
next theorem shows that the similarity goes further.




■ THEOREM 5.2.13


(ω, ≤) is a well-ordered set.


PROOF
By Theorem 5.2.11, (ω, ≤) is a linear order. To prove that it is well-ordered,
suppose that A ⊆ ω such that A does not have a least element. Define


B = {k ∈ ω : {0, 1, ..., k} ∩ A = ∅}.

We prove that B is inductive.


• If 0 ∈ A, then A has a least element because 0 is the least element of ω.
Hence, {0} ∩ A = ∅, so 0 ∈ B.


• Let n ∈ B. This implies that {0, 1, ..., n} ∩ A is empty. Thus, n⁺ cannot
be an element of A for then it would be the least element of A. Hence,
{0, 1, ..., n, n⁺} ∩ A = ∅, proving that n⁺ ∈ B.


Therefore, B = ω, which implies that A is empty. ∎


Recursion 
The familiar factorial is defined recursively as

0! = 1,    (5.4)
(n + 1)! = (n + 1)n!    (n ∈ N).    (5.5)


A recursive definition is one that is given in terms of itself. This is illustrated in
(5.5) where the factorial is defined using the factorial. It appears that the factorial is a
function N → N, but a function is a set, so why do (5.4) and (5.5) define a set? Such a
definition is not found among the axioms of Section 5.1 or the methods of Section 3.1.
That they do define a function requires an important theorem.


■ THEOREM 5.2.14 [Recursion]


Let A be a set and a ∈ A. If g is a function A → A, there exists a unique function
f : ω → A such that


• f(0) = a,
• f(n⁺) = g(f(n)) for all n ∈ ω.

PROOF
Let g : A → A be a function. Define


ℱ = {h : h ⊆ ω × A ∧ (0, a) ∈ h ∧ ∀n∀y[(n, y) ∈ h → (n⁺, g(y)) ∈ h]}.


Note that ℱ is a set by a subset axiom (5.1.8) and ℱ is nonempty because ω × A ∈ ℱ.
Let

f = ⋂ℱ.    (5.6)




Observe that f ∈ ℱ (Exercise 8). Define


D = {n ∈ ω : ∃z[(n, z) ∈ f] ∧ ∀y∀y′[(n, y) ∈ f ∧ (n, y′) ∈ f → y = y′]}.

We prove that D is inductive.


• Since ℱ ≠ ∅, we know that (0, a) ∈ f by (5.6), so let (0, b) also be an
element of f. If we assume that a ≠ b, then f \ {(0, b)} ∈ ℱ, which is
impossible because it implies that (0, b) ∉ ⋂ℱ. Hence, 0 ∈ D.


• Suppose that n ∈ D. This means that (n, z) ∈ f for some z ∈ A. Thus,
(n⁺, g(z)) ∈ f by (5.6). Assume that (n⁺, y) ∈ f. If y ≠ g(z), then
f \ {(n⁺, y)} ∈ ℱ, which again leads to the contradictory (n⁺, y) ∉ ⋂ℱ.


Hence, f is a function ω → A (Theorem 5.2.6). We confirm that f has the
desired properties.


• f(0) = a because (0, a) ∈ ⋂ℱ.


• Take n ∈ ω and write y = f(n). This implies that (n, y) ∈ f, so we have
that (n⁺, g(y)) ∈ f. Therefore, f(n⁺) = g(y) = g(f(n)).


To prove that f is unique, let f′ : ω → A be a function such that f′(0) = a
and f′(n⁺) = g(f′(n)) for all n ∈ ω. Let


E = {n ∈ ω : f(n) = f′(n)}.

The set E is inductive because f(0) = a = f′(0) and assuming n ∈ E we have
that f(n⁺) = g(f(n)) = g(f′(n)) = f′(n⁺). ∎
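
The content of the recursion theorem can be illustrated computationally. The following is a minimal Python sketch, assuming that Python integers stand in for the elements of ω and that iteration is an acceptable substitute for the intersection used in the proof; the name recursion is hypothetical and not from the text.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def recursion(a: T, g: Callable[[T], T]) -> Callable[[int], T]:
    """Return the function f with f(0) = a and f(n + 1) = g(f(n)).

    Python ints model ω here, which is an assumption of this sketch.
    """
    def f(n: int) -> T:
        if n < 0:
            raise ValueError("n must be a natural number")
        value = a
        for _ in range(n):
            value = g(value)  # one application of g per successor step
        return value
    return f


# Example: a = 3 and g(y) = 2y give f(n) = 3 * 2**n.
double_from_three = recursion(3, lambda y: 2 * y)
assert double_from_three(0) == 3
assert all(double_from_three(n + 1) == 2 * double_from_three(n) for n in range(10))
```

The sketch computes values of f one successor step at a time; it does not reproduce the set-theoretic construction of f, and uniqueness is reflected only in the fact that the two conditions leave no freedom in the loop.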


The factorial function has domain N but the recursion theorem (5.2.14) gives a function
with domain ω and uses the successor of Definition 5.2.1. We need a connection
between N and ω. We do this by defining two operations on ω that we designate by +
and · and then showing that the basic properties of ω under these two operations are
the same as the basic properties of N under standard addition and multiplication.


Arithmetic 


We begin with addition. Let g : ω → ω be defined by g(n) = n⁺. For every m ∈ ω, by
Theorem 5.2.14, there exists a unique function fₘ : ω → ω such that


fₘ(0) = m

and for all n ∈ ω,

fₘ(n⁺) = g(fₘ(n)) = [fₘ(n)]⁺.


Define

a = {((m, n), fₘ(n)) : m, n ∈ ω}.


Since fₘ is a function for every m ∈ ω, the set a is a binary operation (Definition 4.4.26).
Observe that for all m, n ∈ ω,




• a(m, 0) = m,
• a(m, n⁺) = a(m, n)⁺.


We know that for every k, l ∈ N, we have that k + 0 = k and to add k + l, one simply
adds 1 a total of l times to k. This is essentially what a does to the natural numbers.
For example, for 1, 3, 4 ∈ N,


1 + 3 = ((1 + 1) + 1) + 1 = 4

and for 1, 2, 3, 4 ∈ ω (page 238),


a(1, 3) = a(1, 2⁺)
= a(1, 2)⁺
= a(1, 1⁺)⁺
= a(1, 1)⁺⁺
= a(1, 0⁺)⁺⁺
= a(1, 0)⁺⁺⁺
= 1⁺⁺⁺
= 4.


Therefore, we choose a to be addition on ω. To define the addition, we only need to cite
the two properties given in Theorem 5.2.14. Since each fₘ is unique, there is no other
function to which the definition could be referring. Therefore, we can define addition
recursively.


■ DEFINITION 5.2.15

For all m, n ∈ ω,
• m + 0 = m,
• m + n⁺ = (m + n)⁺.


Notice that m⁺ = (m + 0)⁺ = m + 0⁺ = m + 1. Furthermore, using the notation from
page 238, we conclude that 1 + 3 = 4 because a(1, 3) = 4.

The following lemma shows that the addition given in Definition 5.2.15 has a prop- 
erty similar to that of commutativity. 


■ LEMMA 5.2.16

For all m, n ∈ ω,
• 0 + n = n,
• m⁺ + n = (m + n)⁺.




PROOF
Let m, n ∈ ω. To prove that 0 + n = n, we show that


A = {k ∈ ω : 0 + k = k}

is inductive.

• 0 ∈ A because 0 + 0 = 0.
• Let n ∈ A. Then, 0 + n⁺ = (0 + n)⁺ = n⁺, so n⁺ ∈ A.

To prove that m⁺ + n = (m + n)⁺, we show that

B = {k ∈ ω : m⁺ + k = (m + k)⁺}

is inductive.

• Again, 0 ∈ B because m⁺ + 0 = m⁺ = (m + 0)⁺.
• Suppose that n ∈ B. We have that

m⁺ + n⁺ = (m⁺ + n)⁺ = (m + n)⁺⁺ = (m + n⁺)⁺,

where the first and third equality follow by Definition 5.2.15 and the second
follows because n ∈ B. Thus, n⁺ ∈ B. ∎


Now to see that + behaves on ω as + behaves on N, we use Definition 4.4.31.


■ THEOREM 5.2.17

• The binary operation + on ω is associative and commutative.


• 0 is the additive identity for ω.


PROOF
0 is the additive identity by Definition 5.2.15 and Lemma 5.2.16, and that + is
associative on ω is Exercise 14. To show that + is commutative, let m ∈ ω and
define

A = {k ∈ ω : m + k = k + m}.


As has been our strategy, we show that A is inductive.


• m + 0 = m by Definition 5.2.15, and 0 + m = m by Lemma 5.2.16, so
0 ∈ A.


• Let n ∈ A. Therefore, m + n = n + m, which implies that

m + n⁺ = (m + n)⁺ = (n + m)⁺ = n⁺ + m.

Hence, n⁺ ∈ A. ∎

Multiplication on N can be viewed as iterated addition. For example,

3 · 4 = 3 + 3 + 3 + 3,


so we define multiplication recursively along these lines. As with addition, the result
is a binary operation by Theorem 5.2.14.




■ DEFINITION 5.2.18

For all m, n ∈ ω,
• m · 0 = 0,
• m · n⁺ = m · n + m.

For example, 3 · 4 = 12 because

3 · 4 = 3 · 3 + 3
= (3 · 2 + 3) + 3
= ([3 · 1 + 3] + 3) + 3
= ([(3 · 0 + 3) + 3] + 3) + 3
= ([3 + 3] + 3) + 3
= 12,


where ([3 + 3] + 3) + 3 = 12 is left to Exercise 13.
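
The recursive definitions of addition and multiplication translate almost line for line into code. The following is a minimal Python sketch, assuming that Python integers stand in for the elements of ω and that n - 1 plays the role of the predecessor of a nonzero n; the function names succ, add, and mul are illustrative.

```python
def succ(n: int) -> int:
    """Successor n+; Python ints model ω in this sketch."""
    return n + 1


def add(m: int, n: int) -> int:
    """m + 0 = m and m + n+ = (m + n)+ (Definition 5.2.15)."""
    if n == 0:
        return m
    return succ(add(m, n - 1))


def mul(m: int, n: int) -> int:
    """m · 0 = 0 and m · n+ = m · n + m (Definition 5.2.18)."""
    if n == 0:
        return 0
    return add(mul(m, n - 1), m)


assert add(1, 3) == 4
assert mul(3, 4) == 12

# Spot-check commutativity and the distributive law on a small range.
small = range(6)
assert all(add(m, n) == add(n, m) for m in small for n in small)
assert all(mul(m, n) == mul(n, m) for m in small for n in small)
assert all(
    mul(m, add(n, o)) == add(mul(m, n), mul(m, o))
    for m in small for n in small for o in small
)
```

The checks cover only finitely many cases, so they are evidence for the algebraic laws and not a replacement for the inductive proofs.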
The next result is analogous to Lemma 5.2.16. Its proof is left to Exercise 9. 


■ LEMMA 5.2.19

For all m, n ∈ ω,
• 0 · m = 0,
• n⁺ · m = n · m + m.


We now prove that · on ω behaves as · on N and that + and · on ω interact with each
other via the distributive law as the two operations on N. For the proof, we introduce
two common conventions for these two operations.


• So that we lessen the use of parentheses, define · to have precedence over +, and
read from left to right. That is,

m · n + o = (m · n) + o and m + n · o = m + (n · o),

and

m + n + o = (m + n) + o and m · n · o = (m · n) · o.

• Define mn = m · n.

■ THEOREM 5.2.20

• The binary operation · on ω is associative and commutative.
• 1 is the multiplicative identity for ω.
• The distributive law holds for ω. This means that for all m, n, o ∈ ω,

m(n + o) = mn + mo.




PROOF
That the associative and commutative properties hold is Exercise 12. To prove
the other parts of the theorem, let m, n, o ∈ ω. Since 0⁺ = 1, by Definition 5.2.18
and Lemma 5.2.16,


m1 = m0⁺ = m0 + m = 0 + m = m,


and by Lemma 5.2.19 and Definition 5.2.15, we have that 1m = m. To show that
the distributive law holds, define


A = {k ∈ ω : k(n + o) = kn + ko}.


Since 0(n + o) = 0 and 0n + 0o = 0 + 0 = 0, we have that 0 ∈ A. Now suppose
that m ∈ A. That is,

m(n + o) = mn + mo.


Therefore, m⁺ ∈ A because

m⁺(n + o) = m(n + o) + (n + o)
= (mn + mo) + (n + o)
= (mn + n) + (mo + o)
= m⁺n + m⁺o. ∎
■ EXAMPLE 5.2.21


We revisit the factorial function. Since addition and multiplication are now defined
on ω, let g : ω × ω → ω × ω be the function


g(k, l) = (k + 1, (k + 1)l).


Theorem 5.2.14 implies that there exists a unique function f : ω → ω × ω such
that


f(0) = (0, 1)

and

f(n + 1) = g(f(n)).


Let π be the projection map π(k, l) = l. By Theorem 5.2.6 and Exercise 10, we
conclude that for all n ∈ ω,


f(n + 1) = (n + 1, (n + 1)(π ∘ f)(n)).    (5.7)


Therefore, the factorial function n!, which is defined recursively by


0! = 1,
(n + 1)! = (n + 1)n!    (n ∈ ω),


is π ∘ f by the uniqueness of f.
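
The construction in this example, which carries the index alongside the value so that the step function depends only on the previous output, can be sketched as follows. This is an illustrative Python sketch, assuming that Python integers stand in for the elements of ω; the names g, f, pi, and factorial mirror the symbols of the example but are otherwise assumptions.

```python
def g(pair):
    """g(k, l) = (k + 1, (k + 1)l), the step function on pairs."""
    k, l = pair
    return (k + 1, (k + 1) * l)


def f(n):
    """f(0) = (0, 1) and f(n + 1) = g(f(n)), computed by iteration."""
    value = (0, 1)
    for _ in range(n):
        value = g(value)
    return value


def pi(pair):
    """Projection onto the second coordinate."""
    return pair[1]


def factorial(n):
    """The composition π ∘ f."""
    return pi(f(n))


assert f(4) == (4, 24)
assert [factorial(n) for n in range(6)] == [1, 1, 2, 6, 24, 120]
```

The first coordinate of f(n) is always n, which is what lets the multiplication by n + 1 be expressed without g having separate access to the index.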




To solve an equation like 6 + 2x = 14 where the coefficients are from N, we can use
the cancellation law and write


6 + 2x = 14,
6 + 2x = 6 + 8,
2x = 8,
2x = 2 · 4,
x = 4.


For equations with coefficients in ω, we need a similar law.


■ THEOREM 5.2.22 [Cancellation]


Let a, b, c ∈ ω.
• If a + b = a + c, then b = c.


• If ab = ac and a ≠ 0, then b = c.


PROOF
The proof for multiplication is Exercise 15. For addition, define


A = {k ∈ ω : ∀l∀m(l ∈ ω ∧ m ∈ ω → [k + l = k + m → l = m])}.


Clearly, 0 ∈ A, so assume n ∈ A. To prove that n⁺ ∈ A, let b, c ∈ ω and suppose
that n⁺ + b = n⁺ + c. Then, (n + b)⁺ = (n + c)⁺ by Lemma 5.2.16, so n + b = n + c
by Corollary 5.2.12. Hence, b = c because n ∈ A. ∎


Exercises 


1. For every set A, show that A⁺ is a set.


2. For every n ∈ ω, show that n⁺⁺⁺⁺⁺ = n + 5.


3. Show that the following are equivalent.
• A is transitive.
• If a ∈ A, then a ⊆ A.
• A ⊆ P(A).
• ⋃A ⊆ A.
• ⋃(A⁺) = A.

4. Prove that ω is transitive.

5. Prove Theorem 5.2.6.

6. Let A ⊆ ω. Show that if ⋃A = A, then A = ω.

7. Take u, v, x, y ∈ ω and assume that u + x = v + y. Prove that u ∈ v if and only if
y ∈ x.




8. Prove that f € F in the proof of Theorem 5.2.14. 
9. Prove Lemma 5.2.19. 
10. Prove (5.7) from Example 5.2.21. 


11. Let A be a set and g : A → A be a one-to-one function. Take a ∈ A \ ran(g).
Recursively define f : ω → A such that


f(0) = a,
f(n⁺) = g(f(n)).

Prove that f is one-to-one.


12. Let m, n, o ∈ ω. Prove the given equations.
(a) mn = nm.
(b) m(n + o) = mn + mo.
13. Show that ([3 + 3] + 3) + 3 = 12.
14. Prove that addition on ω is associative.
15. For all a, b, c ∈ ω with a ≠ 0, prove that if ab = ac, then b = c.
16. Show that for all m, n ∈ ω, if mn = 0, then m = 0 or n = 0.


17. We define exponentiation on ω. For all n ∈ ω,


(a) Use the recursion (Theorem 5.2.14) to prove that this defines a function
ω × ω → ω.
(b) Show that exponentiation on ω is one-to-one.


18. Let x, y, z ∈ ω. Use the definition of Exercise 17 to prove the given equations.
(a) x^(y+z) = x^y x^z.
(b) (xy)^z = x^z y^z.
(c) (x^y)^z = x^(yz).


19. Let m, n, k ∈ ω. Assume that m < n and 0 < k. Demonstrate the following.
(a) m + k < n + k.
(b) mk < nk.
(c) m^k < n^k.


5.3 INTEGERS AND RATIONAL NUMBERS 


Now that we have defined within set theory the set of natural numbers and confirmed 
its basic properties, we wish to continue this with other sets of numbers. We begin with 
the integers. 




Integers 


We build the integers using the natural numbers. The problem is how to define the
negative integers. We need to decide how to represent the adjoining of the negative
sign to a natural number. One option that might work is to use ordered pairs. These are
always good options when extra information needs to be included with each element
of a set. For example, (4, 0) could represent 4 because 4 − 0 = 4, and (0, 4) could
represent −4 because 0 − 4 = −4. However, this is a problem because we did not
define subtraction on ω, so we need another solution. Our decision is to generalize
this idea of subtracting coordinates to the set ω × ω, but use addition to do it. Since
there are infinitely many pairs (m, n) such that m − n = 4, we equate them by using the
equivalence relation of Exercise 4.2.2.


■ DEFINITION 5.3.1

Let R be the equivalence relation on ω × ω defined by

(m, n) R (m′, n′) if and only if m + n′ = m′ + n.

Define ℤ = (ω × ω)/R to be the set of integers.


Using Definition 5.3.1 we associate elements of the familiar set Z with elements of ℤ:


Z     ℤ
−2    [(0, 2)]
−1    [(0, 1)]
0     [(0, 0)]
1     [(1, 0)]
2     [(2, 0)]


Notice that because 0 and 2 are elements of ω, the equivalence class [(0, 2)] is the name
for [(∅, {∅, {∅}})]. Also,


[(0, 2)] = [(1, 3)] = [(2, 4)] = ···.


The ordering of ℤ is defined in terms of the ordering on ω. Be careful to note that the
symbol ≤ will represent two different orders, one on ℤ and one on ω (Definition 5.2.10).
This overuse of the symbol will not lead to confusion because the order will be clear
from context. For example, in the next definition, the first ≤ is the order on ℤ, and the
second ≤ is the order on ω.


■ DEFINITION 5.3.2

For all [(m, n)], [(m′, n′)] ∈ ℤ, define

[(m, n)] ≤ [(m′, n′)] if and only if m + n′ ≤ m′ + n

and

[(m, n)] < [(m′, n′)] if and only if m + n′ < m′ + n.


Since −4 = [(0, 4)] and 3 = [(5, 2)], we have that −4 < 3 because 0 + 2 < 5 + 4.
Since (ℤ, ≤) has no least element (Exercise 3), it is not well-ordered. However, it is
a chain. Note that in the proof the symbol ≤ is again overused.


■ THEOREM 5.3.3

(ℤ, ≤) is a linear order.


PROOF
We use the fact that the order on ω is a linear order (Theorem 5.2.11). Let a, b ∈ ℤ,
so a = [(m, n)] and b = [(m′, n′)] for some m, n, m′, n′ ∈ ω.


• Because m + n ≤ m + n, we have a ≤ a.


• Suppose that a ≤ b and b ≤ a. This implies that m + n′ ≤ m′ + n and
m′ + n ≤ m + n′. Since ≤ is antisymmetric on ω, m + n′ = m′ + n, which
implies that a = b, so ≤ is antisymmetric on ℤ.


• Using a similar strategy, it can be shown that ≤ is transitive on ℤ
(Exercise 4). Thus, (ℤ, ≤) is a poset.


• Since (ω, ≤) is a linear order, m + n′ ≤ m′ + n or m′ + n ≤ m + n′. This
implies that a ≤ b or b ≤ a, so ℤ is a chain under ≤. ∎


Although ω is not a subset of ℤ like N is a subset of Z, the set ω can be embedded
in ℤ (Definition 4.5.29). This is shown by the function φ : ω → ℤ defined by


φ(n) = [(n, 0)].    (5.8)


We see that φ is one-to-one because [(m, 0)] = [(n, 0)] implies that m = n. The function
φ is then an order isomorphism using the relation from Definition 5.3.2 (Exercise 1).
As with ω, we define what it means to add and multiply two integers.


■ DEFINITION 5.3.4

Let [(m, n)], [(u, v)] ∈ ℤ.


• [(m, n)] + [(u, v)] = [(m + u, n + v)].


• [(m, n)] · [(u, v)] = [(mu + nv, mv + un)].


For example, the equation in ℤ,

[(5, 2)] + [(7, 1)] = [(12, 3)],

corresponds to the equation in Z,

3 + 6 = 9,

and the equation

[(5, 2)] · [(7, 1)] = [(35 + 2, 5 + 14)] = [(37, 19)]

corresponds to the equation

3 · 6 = 18.
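
The pair arithmetic above is easy to experiment with. The following is a minimal Python sketch, assuming that a pair of nonnegative Python integers serves as a representative of an equivalence class and that Python integers model ω; the function names equivalent, add, mul, less_equal, and to_int are hypothetical.

```python
def equivalent(p, q):
    """(m, n) R (m', n') if and only if m + n' = m' + n (Definition 5.3.1)."""
    (m, n), (m2, n2) = p, q
    return m + n2 == m2 + n


def add(p, q):
    """Addition of representatives (Definition 5.3.4)."""
    (m, n), (u, v) = p, q
    return (m + u, n + v)


def mul(p, q):
    """Multiplication of representatives (Definition 5.3.4)."""
    (m, n), (u, v) = p, q
    return (m * u + n * v, m * v + u * n)


def less_equal(p, q):
    """Order on representatives (Definition 5.3.2)."""
    (m, n), (m2, n2) = p, q
    return m + n2 <= m2 + n


def to_int(p):
    """Familiar integer m - n represented by the class of (m, n)."""
    return p[0] - p[1]


assert add((5, 2), (7, 1)) == (12, 3)
assert mul((5, 2), (7, 1)) == (37, 19)
assert to_int(mul((5, 2), (7, 1))) == 18

# Different representatives of 3 and 6 give an equivalent product.
assert equivalent(mul((5, 2), (7, 1)), mul((3, 0), (6, 0)))
```

The last assertion is one instance of the well-definedness property: replacing a representative by an equivalent one changes the resulting pair but not its equivalence class.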


Before we check the properties of + and · on ℤ, we should confirm that these are
well-defined. Exercise 5 is for addition. To prove that multiplication is well-defined, let
[(m, n)] = [(m′, n′)] and [(u, v)] = [(u′, v′)]. Then, m + n′ = m′ + n and u + v′ = u′ + v.
Hence, by Theorem 5.2.20 in ω we have that


mu + n′u = m′u + nu,
m′v + nv = mv + n′v,
m′u + m′v′ = m′u′ + m′v,
n′u′ + n′v = n′u + n′v′.

Therefore,

mu + n′u + m′v + nv + m′u + m′v′ + n′u′ + n′v

equals

m′u + nu + mv + n′v + m′u′ + m′v + n′u + n′v′,

so by Cancellation (Theorem 5.2.22),

mu + nv + m′v′ + n′u′ = mv + nu + m′u′ + n′v′.

This implies by Definition 5.3.1 that

[(mu + nv, mv + nu)] = [(m′u′ + n′v′, m′v′ + n′u′)],

and this by Definition 5.3.4 implies that

[(m, n)] · [(u, v)] = [(m′, n′)] · [(u′, v′)].


We follow the same order of operations with + and · on ℤ as on ω and also write
mn for m · n.


■ THEOREM 5.3.5


• The binary operations + and · on ℤ are associative and commutative.
• [(0, 0)] is the additive identity for ℤ, and [(1, 0)] is its multiplicative identity.
• For every a ∈ ℤ, there exists an additive inverse of a.
• For all a, b, c ∈ ℤ, the distributive law holds.




PROOF
We will prove parts of the first and third properties and leave the remaining
properties to Exercise 6. Let a, b, c ∈ ℤ. This means that there exist natural numbers
m, n, r, s, u, and v such that a = [(m, n)], b = [(r, s)], and c = [(u, v)]. Then,

a + b + c = [(m, n)] + [(r, s)] + [(u, v)]
= [(m + r, n + s)] + [(u, v)]
= [(m + r + u, n + s + v)]
= [(m + [r + u], n + [s + v])]
= [(m, n)] + [(r + u, s + v)]
= [(m, n)] + ([(r, s)] + [(u, v)])
= a + (b + c)

and

abc = [(m, n)] · [(r, s)] · [(u, v)]
= [(mr + ns, ms + nr)] · [(u, v)]
= [(u[mr + ns] + v[ms + nr], v[mr + ns] + u[ms + nr])]
= [(umr + uns + vms + vnr, vmr + vns + ums + unr)]
= [(m[ru + sv] + n[su + rv], m[su + rv] + n[ru + sv])]
= [(m, n)] · [(ru + sv, su + rv)]
= [(m, n)] · ([(r, s)] · [(u, v)])
= a(bc).


To prove that every element of ℤ has an additive inverse, notice that

[(m, n)] + [(n, m)] = [(m + n, n + m)] = [(0, 0)].

Therefore, if a = [(m, n)], the additive inverse of a is [(n, m)]. ∎


For all n ∈ ℤ, denote the additive inverse of n by −n, and for all m, n, r, s ∈ ω, define


[(m, n)] − [(r, s)] = [(m, n)] + [(s, r)].


Rational Numbers 


As the integers were built using ω, so the set of rational numbers will be built using ℤ.
Its definition is motivated by the behavior of fractions in Q. For instance,

2/3 = 8/12

because 2 · 12 = 3 · 8. Imagining that the ordered pair (m, n) represents the fraction
m/n, we define an equivalence relation. This is essentially the relation from
Example 4.2.5. Notice that ℤ is defined using addition (Definition 5.3.1) while the
rational numbers are defined using multiplication.




■ DEFINITION 5.3.6

Let S be the equivalence relation on ℤ × (ℤ \ {0}) defined by


(m, n) S (m′, n′) if and only if mn′ = m′n.

Define ℚ = [ℤ × (ℤ \ {0})]/S to be the set of rational numbers.


Using Definition 5.3.6, we associate elements of the familiar set Q with elements of ℚ:


Q     ℚ              Q      ℚ
−2    [(−2, 1)]      1/2    [(1, 2)]
−1    [(−1, 1)]      2/3    [(2, 3)]
0     [(0, 1)]       3/4    [(3, 4)]
1     [(1, 1)]       4/5    [(4, 5)]
2     [(2, 1)]       5/6    [(5, 6)]


Notice that since 1, 2 ∈ ℤ, the equivalence class [(1, 2)] is the name for


[([(1, 0)]_R, [(2, 0)]_R)]_S = [([({∅}, ∅)]_R, [({∅, {∅}}, ∅)]_R)]_S,

where R is the relation of Definition 5.3.1. Also,

[(1, 2)] = [(2, 4)] = [(3, 6)] = ···.

We next define a partial order on ℚ.


■ DEFINITION 5.3.7

For all [(m, n)], [(m′, n′)] ∈ ℚ, define

[(m, n)] ≤ [(m′, n′)] if and only if mn′ ≤ nm′

and

[(m, n)] < [(m′, n′)] if and only if mn′ < nm′.


When working with ℚ, we often denote [(m, n)] by m/n and [(m, 1)] by m. Thus,
since 2/3 = [(2, 3)] and 7/8 = [(7, 8)], we conclude that 2/3 < 7/8 because 2 · 8 < 3 · 7.
As ω can be embedded in ℤ, so ℤ can be embedded in ℚ. The function that can be
used is

ψ : ℤ → [ℤ × (ℤ \ {0})]/S

defined by

ψ(n) = [(n, 1)].    (5.9)

(See Exercise 2.) Using the function φ (5.8), we see that ω can be embedded in ℚ via
the order isomorphism ψ ∘ φ.
Since (ℚ, ≤) has no least element (Exercise 3), it is not well-ordered. However, it is
a chain (Exercise 11).




■ THEOREM 5.3.8


(ℚ, ≤) is a linear order.


Lastly, we define two operations on ℚ that represent the standard operations of +
and · on Q.


■ DEFINITION 5.3.9

Let [(m, n)], [(m′, n′)] ∈ ℚ.


• [(m, n)] + [(m′, n′)] = [(mn′ + m′n, nn′)].
• [(m, n)] · [(m′, n′)] = [(mm′, nn′)].


That + and · as given in Definition 5.3.9 are binary operations is left to Exercise 13.
As examples of these two operations, the equation in ℚ


[(1, 3)] + [(3, 4)] = [(1 · 4 + 3 · 3, 3 · 4)] = [(13, 12)]


corresponds to the equation in Q

1/3 + 3/4 = 13/12,

and the equation in ℚ

[(1, 3)] · [(3, 4)] = [(1 · 3, 3 · 4)] = [(3, 12)] = [(1, 4)]


corresponds to the equation in Q

1/3 · 3/4 = 3/12 = 1/4.


We follow the same order of operations with + and · on ℚ as on ℤ and also write
mn for m · n.
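
The operations of Definition 5.3.9 can be compared against ordinary fraction arithmetic. The following is a minimal Python sketch, assuming that Python integers stand in for the integers of Definition 5.3.1 and that a pair with nonzero second coordinate is a representative of an equivalence class; the helper names are hypothetical, and fractions.Fraction from the standard library is used only as an independent reference.

```python
from fractions import Fraction


def equivalent(p, q):
    """(m, n) S (m', n') if and only if mn' = m'n (Definition 5.3.6)."""
    (m, n), (m2, n2) = p, q
    return m * n2 == m2 * n


def add(p, q):
    """[(m, n)] + [(m', n')] = [(mn' + m'n, nn')]."""
    (m, n), (m2, n2) = p, q
    return (m * n2 + m2 * n, n * n2)


def mul(p, q):
    """[(m, n)] · [(m', n')] = [(mm', nn')]."""
    (m, n), (m2, n2) = p, q
    return (m * m2, n * n2)


def to_fraction(p):
    """Familiar rational number m/n represented by the class of (m, n)."""
    return Fraction(p[0], p[1])


assert add((1, 3), (3, 4)) == (13, 12)
assert mul((1, 3), (3, 4)) == (3, 12)
assert equivalent((3, 12), (1, 4))
assert to_fraction(add((1, 3), (3, 4))) == Fraction(1, 3) + Fraction(3, 4)
assert to_fraction(mul((1, 3), (3, 4))) == Fraction(1, 3) * Fraction(3, 4)
```

The pair operations do not reduce to lowest terms, so equality of results has to be tested with `equivalent` or after conversion, not by comparing pairs directly.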


■ THEOREM 5.3.10


• The binary operations + and · on ℚ are associative and commutative.
• [(0, 1)] is the additive identity, and [(1, 1)] is the multiplicative identity.


• Every rational number has an additive inverse, and every element of ℚ \ {[(0, 1)]}
has a multiplicative inverse.


• The distributive law holds.




PROOF
Since [(m, n)] · [(1, 1)] = [(m · 1, n · 1)] = [(m, n)] for all [(m, n)] ∈ ℚ and
multiplication is commutative, the multiplicative identity of ℚ is [(1, 1)]. Also,
because m ≠ 0 and n ≠ 0 implies that


[(m, n)] · [(n, m)] = [(mn, nm)] = [(1, 1)],


every element of ℚ \ {[(0, 1)]} has a multiplicative inverse. The other properties
are left to Exercise 14. ∎


For all a, b ∈ ℚ with b ≠ 0, denote the additive inverse of a by −a and the multiplicative
inverse of b by b⁻¹.


Actual Numbers 


We now use the elements of ω, ℤ, and ℚ as if they were the actual elements of N, Z,
and Q. For example, we understand the formula n ∈ ℤ to mean that n is an integer
of Definition 5.3.1 with all of the properties of n ∈ Z. Also, when we write that the
formula p(n) is satisfied by some rational number a, we interpret this to mean that p(a)
with a ∈ ℚ where a has all of the properties of a ∈ Q. This allows us to use the set
properties of a natural number, integer, or rational number when needed yet also use
the results we know concerning the actual numbers. That the partial orders and binary
operations defined on ω, ℤ, and ℚ have essentially the same properties as those on N,
Z, and Q allows for this association of properties to be legitimate.


Exercises 


1. Prove that the function φ (5.8) is an order isomorphism using the order of
Definition 5.3.2.


2. Prove that the function ψ (5.9) is an order isomorphism using the order of
Definition 5.3.7.


3. Show that (ℤ, ≤) and (ℚ, ≤) do not have least elements.

4. Prove that ≤ is transitive on ℤ.

5. Prove that + is well-defined on ℤ.

6. Complete the remaining proofs of the properties of Theorem 5.3.5.

7. Show that every nonempty subset of ℤ⁻ = {n : n ∈ ℤ ∧ n < 0} has a greatest
element.


8. Let m, n ∈ ℤ and k ∈ ℤ⁻ (Exercise 7). Prove that if m < n, then nk < mk.


9. Prove that for every m, n ∈ ℤ, if mn = 0, then m = 0 or n = 0. Show that this same
result also holds for ℚ.


10. The cancellation law for the integers states that for all m, n, k ∈ ℤ,
• if m + k = n + k, then m = n,
• if mk = nk and k ≠ 0, then m = n.




Prove that the cancellation law holds for Z. Prove that a similar law holds for Q. 
11. Prove that (Q, <) is a chain. 


12. Demonstrate that every nonempty set of integers with a lower bound with respect 
to < has a least element. 


13. Show that + and - as given in Definition 5.3.9 are binary operations on Q. 
14. Prove the remaining parts of Theorem 5.3.10. 
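15. Let $a, b, c, d \in \mathbb{Q}$. Prove.


(a) If $0 \leq a \leq c$ and $0 \leq b \leq d$, then $ab \leq cd$.
(b) If $0 < a < c$ and $0 < b < d$, then $ab < cd$.


16. Let $a, b \in \mathbb{Q}$. Prove that if $b < 0$, then $a + b < a$.
17. Let $a, b \in \mathbb{Q}$. Prove that if $a > 0$ and $b < 1$, then $ab < a$.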


18. Prove that between any two rational numbers is a rational number. This means that 
Q is dense. Show that this is not the case for Z. 


19. Generalize the definition of exponentiation given in Exercise 5.2.17 to $\mathbb{Z}$. Prove
that this defines a one-to-one function $\mathbb{Z} \times (\mathbb{Z}^+ \cup \{0\}) \to \mathbb{Z}$. Does the proof require
recursion (Theorem 5.2.14)?


20. Let $m, n, k \in \mathbb{Z}$. Assume that $m < n$. Demonstrate the following.
(a) $m + k < n + k$.
(b) $mk < nk$ if $0 < k$.
(c) $nk < mk$ if $k < 0$.
(d) $m^k < n^k$ if $k > 0$.


21. Let $\varphi : \omega \to \mathbb{Z}$ be the embedding defined by (5.8). Define $f : \omega \times \omega \to \omega$ by
$f(m, n) = m^n$ and $g : \mathbb{Z} \times (\mathbb{Z}^+ \cup \{0\}) \to \mathbb{Z}$ by $g(u, v) = u^v$. Prove that for all $m, n \in \omega$,
we have that $g(\varphi(m), \varphi(n)) = \varphi(f(m, n))$. Explain the significance of this result.


5.4 MATHEMATICAL INDUCTION 


Suppose that we want to prove $p(n)$ for all integers $n$ greater than or equal to some $n_0$.
Our previous method (Section 2.4) is to take an arbitrary $n \geq n_0$ and try to prove $p(n)$.
If this is not possible, we might be tempted to try proving p(n) for each n individually. 
This is impossible because it would take infinitely many steps. Instead, we combine 
the results of Sections 5.2 and 5.3 and use the next theorem. 


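■ THEOREM 5.4.1 [Mathematical Induction 1]
Let $p(k)$ be a formula. For any $n_0 \in \mathbb{Z}$, if

\[ p(n_0) \wedge \forall n \in \mathbb{Z}\,[n \geq n_0 \wedge p(n) \rightarrow p(n + 1)], \]

then

\[ \forall n \in \mathbb{Z}\,[n \geq n_0 \rightarrow p(n)]. \]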


258 Chapter 5 AXIOMATIC SET THEORY 


PROOF
Assume $p(n_0)$ and that $p(k)$ implies $p(k + 1)$ for all integers $k \geq n_0$. Define

\[ A = \{k \in \omega : p(n_0 + k)\}. \]

Notice that $0 \in A \Leftrightarrow p(n_0)$, $1 \in A \Leftrightarrow p(n_0 + 1)$, and so on. To prove $p(k)$ for all
integers $k \geq n_0$, we show that $A$ is inductive.

• By hypothesis, $0 \in A$ because $p(n_0)$.

• Assume $n \in A$. This implies that $p(n_0 + n)$. Therefore, $p(n_0 + n + 1)$, so
$n + 1 \in A$. ∎


Theorem 5.4.1 gives rise to a standard proof technique known as mathematical
induction. First, prove $p(n_0)$. Then, show that $p(n)$ implies $p(n + 1)$ for every integer
$n \geq n_0$. Often an analogy of dominoes is used to explain this. Proving $p(n_0)$ is like
tipping over the first domino, and then proving the implication shows that the dominoes
have been set properly. This means that by modus ponens, if $p(n_0 + 1)$ is true, then
$p(n_0 + 2)$ is true, and so forth, each falling like dominoes:

\[
\begin{array}{ll}
p(n_0) & \\
p(n_0) \rightarrow p(n_0 + 1) & \therefore\ p(n_0 + 1) \\
p(n_0 + 1) \rightarrow p(n_0 + 2) & \therefore\ p(n_0 + 2) \ \cdots
\end{array}
\]


This two-step process is characteristic of proofs by mathematical induction. So 
much so, the two stages have their own terminology. 


• Proving $p(n_0)$ is called the basis case. It is typically the easiest part of the proof
but should, nonetheless, be explicitly shown.

• Proving that $p(n)$ implies $p(n + 1)$ is the induction step. For this, we typically
use direct proof, assuming $p(n)$ to show $p(n + 1)$. The assumption is called the
induction hypothesis.


Often induction is performed to prove a formula for all positive integers, so to represent
this set, define

\[ \mathbb{Z}^+ = \{1, 2, 3, \dots\}. \]

■ EXAMPLE 5.4.2
Prove $p(k)$ for all $k \in \mathbb{Z}^+$, where

\[ p(k) := 1^2 + 2^2 + \cdots + k^2 = \frac{k(k + 1)(2k + 1)}{6}. \]

Proceed by mathematical induction.

• The basis case $p(1)$ holds because

\[ \frac{1(1 + 1)(2 \cdot 1 + 1)}{6} = \frac{1(2)(3)}{6} = 1^2. \]

• Now for the induction step, assume

\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}. \tag{5.10} \]

This is the induction hypothesis. We must show that the equation holds for
$n + 1$. Adding $(n + 1)^2$ to both sides of (5.10) gives

\begin{align*}
1^2 + 2^2 + \cdots + n^2 + (n + 1)^2 &= \frac{n(n + 1)(2n + 1)}{6} + (n + 1)^2 \\
&= \frac{(n + 1)(2n^2 + 7n + 6)}{6} \\
&= \frac{(n + 1)(n + 2)(2n + 3)}{6} \\
&= \frac{(n + 1)([n + 1] + 1)(2[n + 1] + 1)}{6}.
\end{align*}
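A numerical spot check can complement (but never replace) the induction argument of Example 5.4.2. The following minimal sketch compares the sum of squares with the closed form for the first 200 positive integers. The function names are illustrative and do not come from the text.

```python
def sum_of_squares(k: int) -> int:
    """Left-hand side of p(k): 1^2 + 2^2 + ... + k^2."""
    return sum(i * i for i in range(1, k + 1))


def closed_form(k: int) -> int:
    """Right-hand side of p(k): k(k + 1)(2k + 1) / 6 (always an integer)."""
    return k * (k + 1) * (2 * k + 1) // 6


# Finite evidence only; the proof for every k is the induction above.
checked = all(sum_of_squares(k) == closed_form(k) for k in range(1, 201))
```

A finite check like this can catch an algebra slip in the induction step, but only the proof establishes the formula for all $k$.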


■ EXAMPLE 5.4.3
Let $x$ be a positive rational number. To prove that for any $k \in \mathbb{Z}^+$,

\[ (x + 1)^k \geq x^k + 1, \]

we proceed by mathematical induction.

• $(x + 1)^1 = x + 1 = x^1 + 1$.

• Let $n \in \mathbb{Z}^+$ and assume that $(x + 1)^n \geq x^n + 1$. By multiplying both sides
of the given inequality by $x + 1$, we have

\begin{align*}
(x + 1)^n (x + 1) &\geq (x^n + 1)(x + 1) \\
&= x^{n+1} + x^n + x + 1 \\
&\geq x^{n+1} + 1.
\end{align*}

The first inequality is true by induction (that is, by appealing to the induction
hypothesis) and since $x + 1$ is positive, and the last one holds because
$x > 0$.


■ EXAMPLE 5.4.4
Recursively define a sequence of numbers,

\begin{align*}
a_1 &= 3, \\
a_n &= 2a_{n-1} \text{ for all } n \in \mathbb{Z} \text{ such that } n > 1.
\end{align*}

The sequence is

\[ a_1 = 3,\ a_2 = 6,\ a_3 = 12,\ a_4 = 24,\ a_5 = 48,\ \dots, \]

and we conjecture that $a_k = 3 \cdot 2^{k-1}$ for all positive integers $k$. We prove this by
mathematical induction.

• For the basis case, $a_1 = 3 \cdot 2^0 = 3$.

• Let $n \geq 1$ and assume $a_n = 3 \cdot 2^{n-1}$. Then,

\[ a_{n+1} = 2a_n = 2 \cdot 3 \cdot 2^{n-1} = 3 \cdot 2^n. \]
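The recursive definition and the conjectured closed form of Example 5.4.4 can be compared directly. This is a small sketch with illustrative names. It computes $a_n$ by repeated doubling and checks it against $3 \cdot 2^{n-1}$.

```python
def a_recursive(n: int) -> int:
    """Compute a_n from a_1 = 3 and a_n = 2 * a_(n-1) for n > 1."""
    value = 3
    for _ in range(n - 1):
        value *= 2
    return value


def a_closed(n: int) -> int:
    """Conjectured closed form 3 * 2^(n-1)."""
    return 3 * 2 ** (n - 1)


first_five = [a_recursive(n) for n in range(1, 6)]
formulas_agree = all(a_recursive(n) == a_closed(n) for n in range(1, 65))
```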


■ EXAMPLE 5.4.5

Use mathematical induction to prove that $k^3 < k!$ for all integers $k \geq 6$.

• First, show that the inequality holds for $n = 6$:

\[ 6^3 = 216 < 720 = 6!. \]

• Assume $n^3 < n!$ with $n \geq 6$. The induction hypothesis yields three inequalities. Namely,

\[ 3n^2 < n \cdot n^2 < n!, \]

\[ 3n < n \cdot n < n^3 < n!, \]

and

\[ 1 < n!. \]

Therefore,

\begin{align*}
(n + 1)^3 &= n^3 + 3n^2 + 3n + 1 \\
&< n! + n! + n! + n! \\
&= 4n! \\
&< (n + 1)n! \\
&= (n + 1)!.
\end{align*}
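The choice of 6 as the starting point in Example 5.4.5 can be seen numerically. The sketch below (names are illustrative) confirms the inequality on a finite range beginning at 6 and shows that it fails at 5, where $5^3 = 125$ exceeds $5! = 120$.

```python
from math import factorial


def cube_below_factorial(k: int) -> bool:
    """Return True when k^3 < k!."""
    return k ** 3 < factorial(k)


# The inequality holds on this finite range starting at the basis case.
holds_from_six = all(cube_below_factorial(k) for k in range(6, 101))

# It fails just below the basis case: 125 < 120 is false.
fails_at_five = not cube_below_factorial(5)
```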


Combinatorics 


We now use mathematical induction to prove some basic results from two areas of
mathematics. The first is combinatorics, the study of the properties that sets have
based purely on their size.

A permutation of a given set is an arrangement of the elements of the set. For
example, the number of permutations of $\{a, b, c, d, e, f\}$ is 720. If we were to write all
of the permutations in a list, it would look like the following:

a b c d e f
a b c d f e
a b c e d f
    ⋮
f e d c a b
f e d c b a


We observe that $6! = 720$ and hypothesize that the number of permutations of a set
with $k$ elements is $k!$. To prove it, we use mathematical induction.

• There is only one way to write the elements of a singleton. Since $1! = 1$, we have
proved the basis case.

• Assume that the number of permutations of a set with $n \geq 1$ elements is $n!$. Let
$A = \{a_1, a_2, \dots, a_{n+1}\}$ be a set with $n + 1$ elements. By induction, there are $n!$
permutations of the set $\{a_1, a_2, \dots, a_n\}$. After writing the permutations in a list,
notice that there are $n + 1$ columns before, between, and after each element of
the permutations:

\[
\begin{array}{cccc}
a_1 & a_2 & \cdots & a_n \\
 & \vdots & & \\
a_n & a_{n-1} & \cdots & a_1
\end{array}
\]

To form the permutations of $A$, place $a_{n+1}$ into the positions of each empty column.
For example, if $a_{n+1}$ is put into the first column, the following permutations
are obtained:

\[
\begin{array}{ccccc}
a_{n+1} & a_1 & a_2 & \cdots & a_n \\
 & & \vdots & & \\
a_{n+1} & a_n & a_{n-1} & \cdots & a_1
\end{array}
\]

Since there are $n!$ rows with $n + 1$ ways to add $a_{n+1}$ to each row, we conclude
that there are $(n + 1)n! = (n + 1)!$ permutations of $A$.


This argument proves the first theorem.

■ THEOREM 5.4.6
Let $n \in \mathbb{Z}^+$. The number of permutations of a set with $n$ elements is $n!$.
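The induction step behind Theorem 5.4.6 is constructive: every permutation of $n$ elements gives $n + 1$ permutations of $n + 1$ elements by inserting the new element into each gap. The sketch below mirrors that construction. The function names are illustrative, and the loop starts from the single empty arrangement rather than from a singleton, which is an assumption made only to keep the code short.

```python
from itertools import permutations
from math import factorial


def insert_everywhere(perms, new):
    """Insert `new` into each of the len(p) + 1 gaps of every permutation p."""
    return [p[:i] + (new,) + p[i:] for p in perms for i in range(len(p) + 1)]


def all_permutations(elements):
    """Build the permutations one element at a time, as in the induction step."""
    perms = [()]  # the single arrangement of no elements
    for element in elements:
        perms = insert_everywhere(perms, element)
    return perms


built = all_permutations("abcdef")
count_matches = len(built) == factorial(6)
same_as_library = set(built) == set(permutations("abcdef"))
```

Each pass through the loop multiplies the number of rows by the number of gaps, which is the factor $n + 1$ in $(n + 1)n! = (n + 1)!$.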


Suppose that we do not want to rearrange the entire set but only subsets of it. For
example, let $A = \{a, b, c, d, e\}$. To see all three-element permutations of $A$, look at the
following list:

abc  acb  bac  bca  cab  cba
abd  adb  bad  bda  dab  dba
abe  aeb  bae  bea  eab  eba
acd  adc  cad  cda  dac  dca
ace  aec  cae  cea  eac  eca
ade  aed  dae  dea  ead  eda
bcd  bdc  cbd  cdb  dbc  dcb
bce  bec  ceb  cbe  ebc  ecb
bde  bed  dbe  deb  ebd  edb
cde  ced  dce  dec  ecd  edc


There are 60 arrangements because there are 5 choices for the first entry. Once that is
chosen, there are only 4 left for the second, and then 3 for the last. We calculate that as

\[ 60 = 5 \cdot 4 \cdot 3 = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{2 \cdot 1} = \frac{5!}{(5 - 3)!}. \]

Generalizing, we define for all $n, r \in \omega$,

\[ {}_nP_r = \frac{n!}{(n - r)!}, \]


and we conclude the following theorem. 


■ THEOREM 5.4.7


Let $r, n \in \mathbb{Z}^+$. The number of permutations of $r$ elements from a set with $n$
elements is ${}_nP_r$.
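The count in Theorem 5.4.7 can be compared with a direct enumeration. In this sketch the helper `n_P_r` is a hypothetical name for the formula $n!/(n - r)!$, and `itertools.permutations` from the Python standard library lists the arrangements of three letters drawn from five.

```python
from itertools import permutations
from math import factorial


def n_P_r(n: int, r: int) -> int:
    """Number of r-element permutations from an n-element set (requires r <= n)."""
    return factorial(n) // factorial(n - r)


three_letter = ["".join(p) for p in permutations("abcde", 3)]
count_by_listing = len(three_letter)
count_by_formula = n_P_r(5, 3)
```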


Now suppose that we only want to count subsets. For example, $A = \{a, b, c, d, e\}$
has 10 subsets of three elements. They are the following:

{a,b,c}  {a,b,d}  {a,b,e}  {a,c,d}  {a,c,e}
{a,d,e}  {b,c,d}  {b,c,e}  {b,d,e}  {c,d,e}

The number of subsets can be calculated by considering the next grid, which has
10 rows and 3! columns.

abc  acb  bac  bca  cab  cba
abd  adb  bad  bda  dab  dba
abe  aeb  bae  bea  eab  eba
acd  adc  cad  cda  dac  dca
ace  aec  cae  cea  eac  eca
ade  aed  dae  dea  ead  eda
bcd  bdc  cbd  cdb  dbc  dcb
bce  bec  ceb  cbe  ebc  ecb
bde  bed  dbe  deb  ebd  edb
cde  ced  dce  dec  ecd  edc


There are ${}_5P_3$ permutations with three elements from $A$. They are found as the entries
in the grid. However, since we are looking at subsets, we do not want to count $abc$
as different from $acb$ because $\{a, b, c\} = \{a, c, b\}$. For this reason, all elements in any
given row of the grid are considered as one subset. Each row has $6 = 3!$ entries because
that is the number of permutations of a set with three elements. Hence, multiplying the
number of rows by the number of columns gives

\[ {}_5P_3 = 10(3!). \]

Therefore,

\[ \frac{{}_5P_3}{3!} = \frac{5!}{3!\,(5 - 3)!} = \frac{5!}{3!\,2!} = 10. \]


A generalization of this calculation leads to the formula for the arbitrary binomial
coefficient,

\[ \binom{n}{r} = \frac{n!}{r!\,(n - r)!}, \]

where $n, r \in \omega$. Read $\binom{n}{r}$ as “n choose r.” A generalization of the argument leads to
the next theorem.


■ THEOREM 5.4.8


Let $n, r \in \mathbb{Z}^+$. The number of subsets of $r$ elements from a set with $n$ elements
is $\binom{n}{r}$.
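The grid argument that leads to Theorem 5.4.8 can be replayed mechanically: group the three-letter permutations of $\{a, b, c, d, e\}$ by the set of letters they use, and count the groups. This is a sketch with illustrative variable names. `math.comb` is the standard-library binomial coefficient (Python 3.8 or later).

```python
from collections import defaultdict
from itertools import permutations
from math import comb, factorial

# One grid row per 3-element subset; each row collects its rearrangements.
rows = defaultdict(list)
for p in permutations("abcde", 3):
    rows[frozenset(p)].append("".join(p))

number_of_rows = len(rows)                       # subsets of size 3
row_lengths = {len(row) for row in rows.values()}  # every row has 3! entries
formula_value = factorial(5) // (factorial(3) * factorial(5 - 3))
```

Dividing the 60 permutations by the common row length 6 gives the 10 subsets, which is the computation ${}_5P_3 / 3!$.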
When we expand $(x + 1)^3$, we find that

\[ (x + 1)^3 = x^3 + 3x^2 + 3x + 1 = \sum_{r=0}^{3} \binom{3}{r} x^{3-r}. \]

To prove this for any binomial $(x + y)^n$, we need the following equation. It was proved
by Blaise Pascal (1653). The proof is Exercise 7.


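■ LEMMA 5.4.9 [Pascal’s Identity]

If $n, r \in \omega$ so that $n \geq r$,

\[ \binom{n + 1}{r} = \binom{n}{r - 1} + \binom{n}{r}. \]

■ THEOREM 5.4.10 [Binomial Theorem]

Let $n \in \mathbb{Z}^+$. Then,

\[ (x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^{n-r} y^r. \]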


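• Assume for $k \in \mathbb{Z}^+$,

\[ (x + y)^k = \sum_{r=0}^{k} \binom{k}{r} x^{k-r} y^r. \]

Then,

\[ (x + y)^{k+1} = (x + y)(x + y)^k = (x + y) \sum_{r=0}^{k} \binom{k}{r} x^{k-r} y^r. \]

Multiplying the $(x + y)$ term through the summation yields

\[ \sum_{r=0}^{k} \binom{k}{r} x^{k-r+1} y^r + \sum_{r=0}^{k} \binom{k}{r} x^{k-r} y^{r+1}. \]

Taking out the $(k + 1)$-degree terms and shifting the index on the second summation gives

\[ x^{k+1} + \sum_{r=1}^{k} \binom{k}{r} x^{k-r+1} y^r + \sum_{r=1}^{k} \binom{k}{r - 1} x^{k-r+1} y^r + y^{k+1}, \]

which using Pascal’s identity (Lemma 5.4.9) equals

\[ x^{k+1} + \sum_{r=1}^{k} \binom{k + 1}{r} x^{k+1-r} y^r + y^{k+1}, \]

and this is

\[ \sum_{r=0}^{k+1} \binom{k + 1}{r} x^{k+1-r} y^r. \]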
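Both Pascal's identity and the binomial theorem lend themselves to a quick numerical check. The sketch below uses `math.comb` from the Python standard library. The function names are illustrative, and the identity is tested only for $1 \leq r \leq n$ so that $r - 1$ is never negative.

```python
from math import comb


def pascal_identity_holds(n: int, r: int) -> bool:
    """Check C(n + 1, r) == C(n, r - 1) + C(n, r) for 1 <= r <= n."""
    return comb(n + 1, r) == comb(n, r - 1) + comb(n, r)


def binomial_expansion(x: int, y: int, n: int) -> int:
    """Right-hand side of the binomial theorem for integer x and y."""
    return sum(comb(n, r) * x ** (n - r) * y ** r for r in range(n + 1))


identity_ok = all(
    pascal_identity_holds(n, r) for n in range(1, 30) for r in range(1, n + 1)
)
expansion_ok = all(
    binomial_expansion(x, y, n) == (x + y) ** n
    for x in range(-3, 4)
    for y in range(-3, 4)
    for n in range(1, 8)
)
```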


Euclid’s Lemma 


Our second application of mathematical induction comes from number theory. It is the 
study of the greatest common divisor (Definition 3.3.11). We begin with a lemma. 


■ LEMMA 5.4.11
Let $a, b, c \in \mathbb{Z}$ such that $a \neq 0$ or $b \neq 0$. If $a \mid bc$ and $\gcd(a, b) = 1$, then $a \mid c$.


PROOF
Assume $a \mid bc$ and $\gcd(a, b) = 1$. Then, $bc = ak$ for some $k \in \mathbb{Z}$. By Theorem 4.3.32,
there exist $m, n \in \mathbb{Z}$ such that

\[ 1 = ma + nb. \]

Therefore, $a \mid c$ because

\[ c = cma + cnb = cma + nak = a(cm + nk). \ ∎ \]
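The proof of Lemma 5.4.11 relies on integers $m$ and $n$ with $1 = ma + nb$. One standard way to compute such coefficients is the extended Euclidean algorithm, sketched below. The algorithm itself is not taken from the text, and the numbers $a = 9$, $b = 4$, $c = 18$ are an illustrative choice.

```python
def extended_gcd(a: int, b: int):
    """Return (g, m, n) with g = m*a + n*b; for positive a, b, g is gcd(a, b)."""
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n


a, b, c = 9, 4, 18          # a divides b*c = 72 and gcd(a, b) = 1
g, m, n = extended_gcd(a, b)
k = (b * c) // a            # bc = ak, as in the proof
quotient = c * m + n * k    # the proof's factor: c = a(cm + nk)
```

With these values the coefficients satisfy $1 = ma + nb$, and multiplying $a$ by `quotient` recovers $c$, exactly as in the last line of the proof.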
Suppose that $p \in \mathbb{Z}$ is a prime (Example 2.4.18) that does not divide $a$. We show
that $\gcd(a, p) = 1$. Take $d > 0$ and assume $d \mid a$ and $d \mid p$. Since $p$ is prime, $d = 1$ or
$d = p$. Since $p \nmid a$, we conclude that $d$ must equal 1, which means $\gcd(a, p) = 1$. Use
this to prove the next result attributed to Euclid (Elements VII.30).


■ THEOREM 5.4.12 [Euclid’s Lemma]


An integer $p > 1$ is prime if and only if $p \mid ab$ implies $p \mid a$ or $p \mid b$ for all
$a, b \in \mathbb{Z}$.


PROOF
• Let $p$ be prime. Suppose $p \mid ab$ but $p \nmid a$. Then, $\gcd(a, p) = 1$. Therefore, $p \mid b$
by Lemma 5.4.11.

• Let $p > 1$. Suppose $p$ satisfies the condition,

\[ \forall a \forall b\,(p \mid ab \rightarrow p \mid a \vee p \mid b). \]

Assume $p$ is not prime. This means that there are integers $c$ and $d$ so that $p = cd$
with $1 < c \leq d < p$. Hence, $p \mid cd$. By hypothesis, $p \mid c$ or $p \mid d$. However,
since $c, d < p$, $p$ can divide neither $c$ nor $d$. This is a contradiction. Hence, $p$
must be prime.


Since 6 divides $3 \cdot 4$ but $6 \nmid 3$ and $6 \nmid 4$, the lemma tells us that 6 is not prime. On
the other hand, if $p$ is a prime that divides 12, then $p$ divides 4 or 3. This means that
$p = 2$ or $p = 3$.

The next theorem is a generalization of Euclid’s lemma. Its proof uses mathematical 
induction. 


■ THEOREM 5.4.13


Let $p$ be prime and $a_i \in \mathbb{Z}$ for $i = 0, 1, \dots, n - 1$. If $p \mid a_0 a_1 \cdots a_{n-1}$, then $p \mid a_j$
for some $j = 0, 1, \dots, n - 1$.


PROOF
• The case when $n = 1$ is trivial because $p$ divides $a_0$ by definition of the product.

• Assume if $p \mid a_0 a_1 \cdots a_{n-1}$, then $p \mid a_j$ for some $j = 0, 1, \dots, n - 1$. Suppose
$p \mid a_0 a_1 \cdots a_n$. Then, by Euclid’s lemma (Theorem 5.4.12),

\[ p \mid a_0 a_1 \cdots a_{n-1} \ \text{ or } \ p \mid a_n. \]

If $p \mid a_n$, we are done. Otherwise, $p$ divides $a_0 a_1 \cdots a_{n-1}$. Hence, $p$ divides one
of the $a_i$ by induction.


Exercises 
1. Letn € Z™. Prove. 
1 
(a) 142434+-:-+¢n= me ) 


(b) 14+3454+---+(2n-l) =r 


(c) (249245) P pOn ya ee 


3 
2 
(d) P4243 gnt a [AED 
_ pnt+l 
(ec) Pert te tes (#1) 
(f) 1-1!42-2!4---¢n-nl=(ntD!-1 
1 2 n 1 
Bee STA aged Noel eee Pome NS ee 
Choy ap? Gap (n+ 1)! 
! 
(hy OSG 104 oe (4n—2) = 2 
n! 
2. Prove for all positive integers n. 
n 
n(n+ 1)(n +2) 
1) = —————_. 
@) YiG+) ; 
i=1 
“ 1 n 
b eh any ean 
(b) 2» ana) 2n+1 
3. Letn € Z*. Prove. 
(a) n<2" 
(b) n! <n" 
n 
1 1 
(c) ee 
1 2 3 n n 
Cee eee Ene 
Oe hast Spee a 


4. Let $n \in \mathbb{Z}$. Prove.
(a) $n^2 < 2^n$ for all $n \geq 5$.
(b) $2^n < n!$ for all $n \geq 4$.
(c) $n^2 < n!$ for all $n \geq 4$.


5. For $n \in \mathbb{Z}^+$, prove that if $A$ has $n$ elements, $P(A)$ has $2^n$ elements.
6. Prove that the number of lines in a truth table with $n$ propositional variables is $2^n$.


7. Demonstrate Pascal’s identity (Lemma 5.4.9). 




8. For all integers n > r > 0, prove the given equations. 


(0) 
© (G2) 


9. Letn,r € Zt with n > r. Prove. 


r r+ n\_ {n+l 
Oe ea G a) 
(b) Pa Paste an- v= (*F1) 


10. Let n > 2 be an integer and prove the given equations. 
n 
(a) > (") =n2""! 
r=1 if 
n 
(b) Yeur(") =0 
r 
r=1 


11. Let n and r be positive integers and n > r. Use induction to show the given 
equations. 


©) er) Ch) 


(b) 1423745) bee One 1) = Gay, 


12. Prove the following for all $n \in \omega$.
(a) $5 \mid n^5 - n$
(b) $9 \mid n^3 + (n + 1)^3 + (n + 2)^3$
(c) $8 \mid 5^{2n} + 7$
(d) $5 \mid 3^{3n+1} + 2^{n+1}$


13. If $p$ is prime and $a$ and $b$ are positive integers such that $a + b = p$, prove that
$\gcd(a, b) = 1$.


14. Prove for all $n \in \mathbb{Z}^+$, there exist $n$ consecutive composite integers (Example 2.4.18)
by showing that $(n + 1)! + 2, (n + 1)! + 3, \dots, (n + 1)! + n + 1$ are composite.


15. Prove.
(a) If $a \neq 0$, then $a \cdot \gcd(b, c) = \gcd(ab, ac)$.
(b) If $\gcd(a_i, b) = 1$ for $i = 1, \dots, n$, then $\gcd(a_1 \cdot a_2 \cdots a_n, b) = 1$.


16. For $k \in \mathbb{Z}^+$, let $a_0, a_1, \dots, a_{k-1} \in \omega$, not all equal to zero. Define

\[ g = \gcd(a_0, a_1, \dots, a_{k-1}) \]

to mean that $g$ is the greatest integer such that $g \mid a_i$ for all $i = 0, 1, \dots, k - 1$. Assuming
$k \geq 3$, prove the given equations.

(a) $\gcd(a_0, a_1, \dots, a_{k-1}) = \gcd(a_0, a_1, \dots, a_{k-3}, \gcd(a_{k-2}, a_{k-1}))$
(b) $\gcd(ca_0, ca_1, \dots, ca_{k-1}) = c \gcd(a_0, a_1, \dots, a_{k-1})$ for all integers $c \neq 0$


5.5 STRONG INDUCTION 


Suppose we want to find an equation for the terms of a sequence defined recursively in 
which each term is based on two or more previous terms. To prove that such an equation 
is correct, we modify mathematical induction. Remember the domino picture that we 
used to explain how mathematical induction works (page 258). The first domino is 
tipped causing the second to fall, which in turn causes the third to fall. By the time the 
sequence of falls reaches the n + 1 domino, n dominoes have fallen. This means that 
sentences p(1) through p(n) have been proved true. It is at this point that p(n + 1) is
proved. This is the intuition behind the next theorem. It is sometimes called strong 
induction. 


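■ THEOREM 5.5.1 [Mathematical Induction 2]
Let $p(k)$ be a formula. For any $n_0 \in \mathbb{Z}$, if

\[ p(n_0) \wedge \forall k \bigl( k \in \omega \wedge \forall l\,( l \in \omega \wedge l \leq k \rightarrow p(n_0 + l) ) \rightarrow p(n_0 + k + 1) \bigr), \]

then

\[ \forall k\,[k \in \mathbb{Z} \wedge k \geq n_0 \rightarrow p(k)]. \]

PROOF
Assume $p(n_0)$ and

\[ p(n_0) \wedge p(n_0 + 1) \wedge \cdots \wedge p(n_0 + k) \rightarrow p(n_0 + k + 1) \tag{5.11} \]

for $k \in \omega$. Define

\[ q(k) := p(n_0) \wedge p(n_0 + 1) \wedge \cdots \wedge p(n_0 + k). \]

We proceed with the induction.

• Since $p(n_0)$ holds, we have $q(0)$.

• Assume $q(n)$ with $n \geq 0$. By definition of $q(n)$, we have $p(n_0)$ through
$p(n_0 + n)$. Thus, $p(n_0 + n + 1)$ by (5.11) from which $q(n + 1)$ follows.

Therefore, $q(n)$ is true for all $n \in \omega$ (Theorem 5.4.1). Hence, $p(n)$ for all integers
$n \geq n_0$. ∎
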
Fibonacci Sequence 


Leonardo of Pisa (known as Fibonacci) in his 1202 work Liber abaci posed a problem 
about how a certain population of rabbits increases with time (Fibonacci and Sigler 
2002). Each rabbit that is at least 2 months old is considered an adult. It is a young
rabbit if it is a month old. Otherwise, it is a baby. The rules that govern the population
are as follows:


¢ No rabbits die. 
e The population starts with a pair of adult rabbits. 
e Each pair of adult rabbits will bear a new pair each month. 


The population then grows according to the following table: 


Month   Adult Pairs   Young Pairs   Baby Pairs
  1          1             0             1
  2          1             1             1
  3          2             1             2
  4          3             2             3
  5          5             3             5
  6          8             5             8


It appears that the number of adult (or baby) pairs at month $n$ is given by the sequence,

\[ 1, 1, 2, 3, 5, 8, 13, 21, 34, \dots. \]

This is known as the Fibonacci sequence, and each term of the sequence is called a
Fibonacci number. Let $F_n$ denote the $n$th term of the sequence. So

\[ F_1 = 1,\ F_2 = 1,\ F_3 = 2,\ F_4 = 3,\ F_5 = 5,\ F_6 = 8,\ \dots. \]

Each term of the sequence can be calculated recursively by

\begin{equation}
\begin{aligned}
F_1 &= 1, \\
F_2 &= 1, \\
F_n &= F_{n-1} + F_{n-2} \quad (n > 2).
\end{aligned}
\tag{5.12}
\end{equation}

Since we have only checked a few terms, we have not proved that $F_n$ is equal to the
number of adult pairs in the $n$th month. To show this, we use strong induction. Since
the recursive definition starts by explicitly defining $F_1$ and $F_2$, the basis case for the
induction will prove that the formula holds for $n = 1$ and $n = 2$.


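• From the table, in each of the first 2 months, there is exactly one adult pair of
rabbits. This coincides with $F_1 = 1$ and $F_2 = 1$ in (5.12).

• Let $n \geq 2$, and assume that $F_k$ equals the number of adult pairs in the $k$th month
for all $k \leq n$. Because of the third rule, the number of pairs of adults in any
month is the same as the number of adult pairs in the previous month plus the
number of baby pairs 2 months prior. Therefore,

\begin{align*}
\begin{bmatrix} \text{\# of adult pairs} \\ \text{in month } n + 1 \end{bmatrix}
&= \begin{bmatrix} \text{\# of adult pairs} \\ \text{in month } n \end{bmatrix}
 + \begin{bmatrix} \text{\# of young pairs} \\ \text{in month } n \end{bmatrix} \\
&= \begin{bmatrix} \text{\# of adult pairs} \\ \text{in month } n \end{bmatrix}
 + \begin{bmatrix} \text{\# of baby pairs} \\ \text{in month } n - 1 \end{bmatrix} \\
&= \begin{bmatrix} \text{\# of adult pairs} \\ \text{in month } n \end{bmatrix}
 + \begin{bmatrix} \text{\# of adult pairs} \\ \text{in month } n - 1 \end{bmatrix} \\
&= F_n + F_{n-1} \\
&= F_{n+1}.
\end{align*}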

It turns out that the Fibonacci sequence is closely related to another famous object
of study in the history of mathematics. Letting $n \geq 1$, define

\[ a_n = \frac{F_{n+1}}{F_n}. \]

The first seven terms of this sequence are

\begin{align*}
a_1 &= 1/1 = 1, \\
a_2 &= 2/1 = 2, \\
a_3 &= 3/2 = 1.5, \\
a_4 &= 5/3 \approx 1.667, \\
a_5 &= 8/5 = 1.6, \\
a_6 &= 13/8 = 1.625, \\
a_7 &= 21/13 \approx 1.615.
\end{align*}

This sequence has a limit that we call $\tau$. To find this limit, notice that

\[ \frac{F_{n+1}}{F_n} = \frac{F_n + F_{n-1}}{F_n} = 1 + \frac{F_{n-1}}{F_n}. \]

Because $a_{n-1} = F_n / F_{n-1}$ when $n > 1$,

\[ a_n = 1 + \frac{1}{a_{n-1}}, \]

and therefore,

\[ a_n - 1 - \frac{1}{a_{n-1}} = 0. \]

Because

\[ \lim_{n \to \infty} a_n = \lim_{n \to \infty} a_{n-1} = \tau, \]

we conclude that

\[ \tau^2 - \tau - 1 = 0. \tag{5.13} \]

Therefore, $(1 \pm \sqrt{5})/2$ are the solutions to (5.13), but since $F_{n+1}/F_n > 0$, we take the
positive value and find that

\[ \tau = \frac{1 + \sqrt{5}}{2}. \]

The number $\tau$ is called the golden ratio. It was considered by the ancient Greeks to
represent the ratio of the sides of the most beautiful rectangle.
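The convergence of the ratios $F_{n+1}/F_n$ to the golden ratio can be observed numerically. In the sketch below the ratios are kept as exact fractions, so the recurrence $a_n = 1 + 1/a_{n-1}$ can be checked without rounding. The names and the tolerance `1e-12` are illustrative assumptions.

```python
from fractions import Fraction
from math import sqrt


def fibonacci_ratios(count: int):
    """Return [a_1, ..., a_count] with a_n = F_(n+1) / F_n as exact fractions."""
    ratios = []
    f_n, f_next = 1, 1
    for _ in range(count):
        ratios.append(Fraction(f_next, f_n))
        f_n, f_next = f_next, f_n + f_next
    return ratios


golden = (1 + sqrt(5)) / 2
ratios = fibonacci_ratios(40)

# Exact check of a_n = 1 + 1/a_(n-1) for n > 1.
recurrence_ok = all(ratios[i] == 1 + 1 / ratios[i - 1] for i in range(1, len(ratios)))

# Floating-point check that the positive root satisfies (5.13) and is the limit.
root_ok = abs(golden ** 2 - golden - 1) < 1e-12
close_to_limit = abs(float(ratios[-1]) - golden) < 1e-12
```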


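■ EXAMPLE 5.5.2
Prove $F_n < \tau^{n-1}$ when $n \geq 2$ using strong induction, where $\tau = (1 + \sqrt{5})/2$ is the golden ratio.

• Since $F_2 = 1 < \tau^1 \approx 1.618$, the inequality holds for $n = 2$.

• Let $n \geq 3$, and assume that $F_k < \tau^{k-1}$ for all $k$ such that $2 \leq k < n$.
Because $\tau^{-1} = (\sqrt{5} - 1)/2$ and $\tau^{-2} = (3 - \sqrt{5})/2$,

\[ \tau^{-1} + \tau^{-2} = 1. \]

Therefore, the induction hypothesis gives

\[ F_n = F_{n-1} + F_{n-2} < \tau^{n-2} + \tau^{n-3} = \tau^{n-1}(\tau^{-1} + \tau^{-2}) = \tau^{n-1}. \]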


Unique Factorization 


Theorem 5.4.13 states that if a prime divides an integer, it divides one of the factors of 
the integer. It appears reasonable that any integer can then be written as a product that 
includes all of its prime divisors. For example, we can write 126 = 2-3 -3-7, and this 
is essentially the only way in which we can write 126 as a product of primes. All of 
this is summarized in the next theorem. It is also known as the fundamental theorem 
of arithmetic. It is the reason the primes are important. They are the building blocks 
of the integers. 


■ THEOREM 5.5.3 [Unique Factorization]


If $n > 1$, there exists a unique sequence of primes $p_0 \leq p_1 \leq \cdots \leq p_k$ ($k \in \omega$)
such that $n = p_0 p_1 \cdots p_k$.


PROOF
Prove existence with strong induction on $n$.

• When $n = 2$, we are done since 2 is prime.

• Assume that $k$ can be written as the product of primes as described above
for all $k$ such that $2 \leq k < n$. If $n$ is prime, we are done as in the basis
case. So suppose $n$ is composite. Then, there exist integers $a$ and $b$ such
that $n = ab$ and $1 < a \leq b < n$. By the induction hypothesis, we can write

\[ a = q_0 q_1 \cdots q_u \]

and

\[ b = r_0 r_1 \cdots r_v, \]

where the $q_i$ and $r_j$ are primes. Now place these primes together in increasing
order and relabel them as

\[ p_0 \leq p_1 \leq \cdots \leq p_k \]

with $k = u + v + 1$. Then, $n = p_0 p_1 \cdots p_k$ as desired.


For uniqueness, suppose that there are two sets of primes

\[ p_0 \leq p_1 \leq \cdots \leq p_k \quad \text{and} \quad q_0 \leq q_1 \leq \cdots \leq q_l \]

so that

\[ n = p_0 p_1 \cdots p_k = q_0 q_1 \cdots q_l. \]

By canceling, if necessary, we can assume the sides have no common primes. If
the cancellation yields $1 = 1$, the sets of primes are the same. In order to obtain a
contradiction, assume that there is at least one prime remaining on the left-hand
side. Suppose it is $p_0$. If the product on the right equals 1, then $p_0 \mid 1$, which is
impossible. If there are primes remaining on the right, $p_0$ divides one of them
by Euclid’s lemma (Theorem 5.4.12). This is also a contradiction, since the sides have no common
prime factors because of the cancellation. Hence, the two sequences must be the
same.
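The existence half of the proof suggests a way to compute a factorization: keep splitting off factors until only primes remain. The sketch below uses plain trial division, which is a simple stand-in chosen for illustration and is not the argument used in the proof. The function name is hypothetical.

```python
def prime_factors(n: int):
    """Return the primes p_0 <= p_1 <= ... <= p_k with n = p_0 * p_1 * ... * p_k."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        # Any d that divides n here is prime, because smaller primes
        # have already been divided out.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


example = prime_factors(126)
```

Because the divisors are tried in increasing order, the list comes out sorted, matching the nondecreasing sequence in the statement of the theorem.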


Unique Factorization allows us to make the following definition. 


■ DEFINITION 5.5.4


Let $n \in \mathbb{Z}^+$. If $p_0, p_1, \dots, p_{k-1}$ are distinct primes and $r_0, r_1, \dots, r_{k-1}$ are natural
numbers such that

\[ n = p_0^{r_0} p_1^{r_1} \cdots p_{k-1}^{r_{k-1}}, \]

then $p_0^{r_0} p_1^{r_1} \cdots p_{k-1}^{r_{k-1}}$ is called a prime power decomposition of $n$.


■ EXAMPLE 5.5.5


Consider the integer 360. It has $2^3 \cdot 3^2 \cdot 5^1$ as a prime power decomposition. If the
exponents are limited to positive integers, the expression is unique. In this sense,
we can say that $2^3 \cdot 3^2 \cdot 5^1$ is the prime power decomposition of 360. However,
there are times when primes need to be included in the product that are not factors
of the integer. By setting the exponent to zero, these primes can be included. For
example, we can also write 360 as $2^3 \cdot 3^2 \cdot 5^1 \cdot 7^0$.


■ EXAMPLE 5.5.6


Suppose $n \in \mathbb{Z}$ such that $n > 1$. Use unique factorization (Theorem 5.5.3)
to prove that $n$ is a perfect square if and only if all powers in a prime power
decomposition of $n$ are even.

• Let $n$ be a perfect square. This means that $n = k^2$ for some integer $k > 1$.
Write a prime power decomposition of $k$,

\[ k = p_0^{r_0} p_1^{r_1} \cdots p_{l-1}^{r_{l-1}}. \]

Therefore,

\[ n = k^2 = p_0^{2r_0} p_1^{2r_1} \cdots p_{l-1}^{2r_{l-1}}. \]

• Assume all the powers are even in a prime power decomposition of $n$.
Namely,

\[ n = p_0^{r_0} p_1^{r_1} \cdots p_{l-1}^{r_{l-1}}, \]

where there exists $u_i \in \mathbb{Z}$ so that $r_i = 2u_i$ for $i = 0, 1, \dots, l - 1$. Thus,

\[ n = p_0^{2u_0} p_1^{2u_1} \cdots p_{l-1}^{2u_{l-1}} = \left( p_0^{u_0} p_1^{u_1} \cdots p_{l-1}^{u_{l-1}} \right)^2, \]

a perfect square.
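The criterion of Example 5.5.6 can be tested against a direct square test. The sketch below computes the exponents of a prime power decomposition by trial division (an illustrative method, not one taken from the text) and compares the "all exponents even" test with `math.isqrt`, which returns the integer square root in Python 3.8 or later. All function names are hypothetical.

```python
from collections import Counter
from math import isqrt


def prime_power_decomposition(n: int) -> Counter:
    """Map each prime divisor of n > 1 to its (positive) exponent."""
    exponents = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            exponents[d] += 1
            n //= d
        d += 1
    if n > 1:
        exponents[n] += 1
    return exponents


def is_square_by_exponents(n: int) -> bool:
    """True when every exponent in the decomposition of n is even."""
    return all(r % 2 == 0 for r in prime_power_decomposition(n).values())


def is_square_directly(n: int) -> bool:
    """True when n equals the square of its integer square root."""
    return isqrt(n) ** 2 == n


criteria_agree = all(
    is_square_by_exponents(n) == is_square_directly(n) for n in range(2, 5001)
)
```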


Exercises 


1. Given each recursive definition, prove the formula for $a_n$ holds for all positive integers $n$.

(a) If $a_1 = -1$ and $a_n = -a_{n-1}$, then $a_n = (-1)^n$.
(b) If $a_1 = 1$ and $a_n = (1/3)a_{n-1}$, then $a_n = (1/3)^{n-1}$.
(c) If $a_1 = 0$, $a_2 = -6$, and $a_n = 5a_{n-1} - 6a_{n-2}$, then
$a_n = 3 \cdot 2^n - 2 \cdot 3^n$.
(d) If $a_1 = 4$, $a_2 = 12$, and $a_n = 4a_{n-1} - 2a_{n-2}$, then
$a_n = (2 + \sqrt{2})^n + (2 - \sqrt{2})^n$.
(e) If $a_1 = 1$, $a_2 = 5$, and $a_{n+1} = a_n + 2a_{n-1}$ for all $n \geq 2$, then
$a_n = 2^n + (-1)^n$.
(f) If $a_1 = 3$, $a_2 = -3$, $a_3 = 9$, and $a_n = a_{n-1} + 4a_{n-2} - 4a_{n-3}$, then
$a_n = 1 - (-2)^n$.
(g) If $a_1 = 3$, $a_2 = 10$, $a_3 = 21$, and $a_n = 3a_{n-1} - 3a_{n-2} + a_{n-3}$, then
$a_n = n + 2n^2$.


2. Let $g_1 = a$, $g_2 = b$, and $g_n = g_{n-1} + g_{n-2}$ for all $n > 2$. This sequence is called the
generalized Fibonacci sequence. Show that $g_n = aF_{n-2} + bF_{n-1}$ for all $n > 2$.


3. Letn > 0 be an integer. Prove. 
(a) Fys2 > qn 


n 
() = Fu-1 
i=1 


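4. Prove that Theorem 5.5.1 implies Theorem 5.4.1.


5. Let $\sigma = (1 - \sqrt{5})/2$ and demonstrate that

\[ F_n = \frac{\tau^n - \sigma^n}{\sqrt{5}}, \]

where $\tau$ is the golden ratio.


6. Let $n \geq 1$ and $a \in \mathbb{Z}$. Prove.
(a) $a^{n+1} - 1 = (a + 1)(a^n - 1) - a(a^{n-1} - 1)$.
(b) $a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + a + 1)$.


7. For all $n \in \omega$, prove that 12 divides $n^4 - n^2$ (Definition 2.4.2).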


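8. Assume $e \mid a$ and $e \mid b$. Write prime power decompositions for $a$ and $b$:

\[ a = p_0^{r_0} p_1^{r_1} \cdots p_{k-1}^{r_{k-1}} \]

and

\[ b = p_0^{s_0} p_1^{s_1} \cdots p_{k-1}^{s_{k-1}}. \]

Prove that there exist $t_0, t_1, \dots, t_{k-1} \in \omega$ such that

\[ e = p_0^{t_0} p_1^{t_1} \cdots p_{k-1}^{t_{k-1}}, \]

$t_i \leq r_i$, and $t_i \leq s_i$ for all $i = 0, 1, \dots, k - 1$.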
9. Prove that a® | b* implies a | b for all a,b € Z. 


10. Let $a \in \mathbb{Z}^+$. Let $a$ have the property that for all primes $p$, if $p \mid a$, then $p^2 \mid a$.
Prove that $a$ is the product of a perfect square and a perfect cube.


11. Prove that gcd(F,,, F,49) = 1 for alln € Z*. 


5.6 REAL NUMBERS 


As $\mathbb{Z}$ is defined using $\omega$ and $\mathbb{Q}$ is defined using $\mathbb{Z}$, the set analog to the real numbers is defined using
$\mathbb{Q}$. We start with a definition.


■ DEFINITION 5.6.1
Let $(A, \preceq)$ be a poset. The set $B$ is an initial segment of $A$ when $B \subseteq A$ and

\[ \text{for all } a, b \in A, \text{ if } a \preceq b \text{ and } b \in B, \text{ then } a \in B. \tag{5.14} \]

An initial segment $B$ of $A$ is proper if $B \neq A$.

The condition (5.14) is called downward closed. Notice that for all $a \in \mathbb{R}$, both
$(-\infty, a)$ and $(-\infty, a]$ are initial segments of $(\mathbb{R}, \leq)$. A poset is an initial segment of
itself, but it is not proper.


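■ DEFINITION 5.6.2
Let $(A, \preceq)$ be a poset with $b \in A$. Define

\[ \operatorname{seg}_{\preceq}(A, b) = \{a \in A : a \prec b\}. \]

For example,

\[ \operatorname{seg}_{\leq}(\mathbb{R}, 5) = (-\infty, 5) \]

in $(\mathbb{R}, \leq)$, and

\[ \operatorname{seg}_{\leq}(\mathbb{Z}, 5) = \{\dots, 0, 1, 2, 3, 4\} \]

in $(\mathbb{Z}, \leq)$. Both of these are proper initial segments. Notice that every initial segment
of $\mathbb{Z}$ is of the form $\operatorname{seg}_{\leq}(\mathbb{Z}, n)$ for some $n \in \mathbb{Z}$.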

Neither $\mathbb{Z}$ nor $\mathbb{Q}$ is well-ordered by $\leq$. If a poset is well-ordered, its initial segments
have a particular form.


■ LEMMA 5.6.3


If $(A, \preceq)$ is a well-ordered set and $B$ is a proper initial segment of $A$, there exists
a unique $m \in A$ such that $B = \operatorname{seg}_{\preceq}(A, m)$.


PROOF
Suppose that $(A, \preceq)$ is well-ordered and $B \subset A$ is a proper initial segment. First,
note that $A \setminus B$ is not empty since $B$ is proper. Thus, $A \setminus B$ contains a least
element $m$ because $\preceq$ well-orders $A$.

• Let $a \in B$, which implies that $a \prec m$. Otherwise, $m$ would be an element of
$B$ because $B$ is downward closed. Hence, $a \in \operatorname{seg}_{\preceq}(A, m)$, which implies
that $B \subseteq \operatorname{seg}_{\preceq}(A, m)$.

• Conversely, take $a \in \operatorname{seg}_{\preceq}(A, m)$, which means $a \prec m$. If $a \in A \setminus B$, then
$m \preceq a$ because $m$ is the least element of $A \setminus B$, so $a$ must be an element of
$B$ by the trichotomy law (Theorem 4.3.21). Thus, $\operatorname{seg}_{\preceq}(A, m) \subseteq B$.

To prove uniqueness, let $m' \in A$ such that $m \neq m'$ and $B = \operatorname{seg}_{\preceq}(A, m')$. If
$m \prec m'$, then $m \in \operatorname{seg}_{\preceq}(A, m')$, and if $m' \prec m$, then $m' \in B = \operatorname{seg}_{\preceq}(A, m)$. Both
cases are impossible. ∎

Dedekind Cuts 

A basic property of $\mathbb{R}$ is that it is complete. This means that

every nonempty set of real numbers with an upper bound
has a real least upper bound

(Definition 4.3.12). For example, the set

\[ A = \left\{ 1 - \frac{1}{n} : n \in \mathbb{Z}^+ \right\} \]

is bounded from above, and its least upper bound is 1. Also,

\[ B = \{3, 3.1, 3.14, 3.141, 3.1415, 3.14159, \dots\} \]

is bounded from above, and its least upper bound is $\pi$. Observe that both $A$ and $B$ are
sets of rational numbers. The set $A$ has a rational least upper bound, but $B$ does not.
This shows that the rational numbers are not complete. Intuitively, the picture is that
of a number line. If $\mathbb{R}$ is graphed, there are no holes because of completeness, but if
$\mathbb{Q}$ is graphed, there are holes. These holes represent the irrational numbers that when
filled, complete the rational numbers resulting in the set of reals. We use this idea to
construct a model of the real numbers from $\mathbb{Q}$ (Dedekind 1901).

■ DEFINITION 5.6.4


A set $x$ of rational numbers is a Dedekind cut (or a real number) if $x$ is a subset
of $(\mathbb{Q}, \leq)$ such that

• $x$ is nonempty,

• $x$ is a proper initial segment of $\mathbb{Q}$,

• $x$ does not have a greatest element.

Denote the set of Dedekind cuts by $\mathbb{R}$.


By Lemma 5.6.3, some Dedekind cuts are of the form $\operatorname{seg}_{\leq}(\mathbb{Q}, a)$ for some $a \in \mathbb{Q}$. In
this case, write $\mathbf{a} = \operatorname{seg}_{\leq}(\mathbb{Q}, a)$. Therefore, $\mathbb{Q}$ can be embedded in $\mathbb{R}$ using the function
$f : \mathbb{Q} \to \mathbb{R}$ defined by $f(a) = \mathbf{a}$ (Exercise 14). The elements of $\mathbb{R} \setminus \operatorname{ran}(f)$ are the
irrational numbers.


■ EXAMPLE 5.6.5
• The Dedekind cut that corresponds to the integer 7 is

\[ \mathbf{7} = \operatorname{seg}_{\leq}(\mathbb{Q}, 7). \]

Note that there is no gap between $\mathbf{7}$ and $\mathbb{Q} \setminus \mathbf{7} = \{a \in \mathbb{Q} : a \geq 7\}$.

• The Dedekind cut $x$ that corresponds to $\pi$ includes

\[ \{3, 3.1, 3.14, 3.141, 3.1415, 3.14159, \dots\} \]

as a subset. Notice that $\pi \notin x$ and $\pi \notin \mathbb{Q} \setminus x$ because $\pi$ is not rational. This
means that we imagine a gap between $x$ and $\mathbb{Q} \setminus x$. This gap is where $\pi$ is located.
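Since a Dedekind cut is an infinite set of rationals, a program can only represent it indirectly. One possible sketch, under the assumption that a cut is modeled by its membership test on `fractions.Fraction` values, is shown below. The cut for $\sqrt{2}$ is used as the irrational example instead of $\pi$ because its membership test is a single rational inequality. That choice and all function names are illustrative and not taken from the text.

```python
from fractions import Fraction


def rational_cut(a: Fraction):
    """Membership test for seg(Q, a) = {q in Q : q < a}."""
    return lambda q: q < a


def sqrt2_cut(q: Fraction) -> bool:
    """Membership test for the cut {q in Q : q < 0 or q^2 < 2}."""
    return q < 0 or q * q < 2


def larger_member(q: Fraction, a: Fraction) -> Fraction:
    """Given q < a, return a rational strictly between q and a.

    This witnesses that seg(Q, a) has no greatest element.
    """
    return (q + a) / 2


seven = rational_cut(Fraction(7))
q = Fraction(69, 10)
r = larger_member(q, Fraction(7))
```

Here `seven(Fraction(7))` is false, so 7 is the least element of the complement and there is a point at the cut. For `sqrt2_cut`, no rational plays that role, which is the gap described above.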


Imagine that only the rational numbers have been placed on the number line (Section 5.3).
For every Dedekind cut $x$, call the point on the number line where $x$ and
$\mathbb{Q} \setminus x$ meet a cut. Following our intuition, if the least upper bound of $x$ is an element of
$\mathbb{Q} \setminus x$, there is a point at the cut [Figure 5.1(a)]. This means that $x$ represents a rational
number, a point already on the number line. However, if the least upper bound of $x$ is
not an element of $\mathbb{Q} \setminus x$, there is no point at the cut [Figure 5.1(b)]. This means that
$x$ represents an irrational number. To obtain all real numbers, a point must be placed
at each cut without a point, filling the entire number line. Therefore, the first step in
showing that the set $\mathbb{R}$ of Dedekind cuts is a suitable model for the real numbers is to prove that every set of Dedekind cuts
with an upper bound must have a least upper bound that is a Dedekind cut. That
is, $\mathbb{R}$ must be shown to be complete. To accomplish this, we first define an order on $\mathbb{R}$.

Figure 5.1 The cuts of two types of numbers, each drawn as the point where $x$ meets $\mathbb{Q} \setminus x$: (a) a rational number; (b) an irrational number.


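■ DEFINITION 5.6.6


Let $x, y \in \mathbb{R}$. Define $x \leq y$ if and only if $x \subseteq y$, and define $x < y$ to mean $x \leq y$
and $x \neq y$.


For example, $\mathbf{3} < \mathbf{4}$ in $\mathbb{R}$ because $\mathbf{3} = \operatorname{seg}_{\leq}(\mathbb{Q}, 3)$ and $\mathbf{4} = \operatorname{seg}_{\leq}(\mathbb{Q}, 4)$. Because the order
on $\mathbb{Q}$ is linear (Theorem 5.3.8), it is left to Exercise 5 to prove that we have defined a
linear order on $\mathbb{R}$.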


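■ THEOREM 5.6.7


$(\mathbb{R}, \leq)$ is a linear order but it is not a well-order.

Now to prove that $\mathbb{R}$ is complete using the order of Definition 5.6.6.

■ THEOREM 5.6.8

Every nonempty subset of $\mathbb{R}$ with an upper bound has a real least upper bound.


PROOF
Let $\mathcal{F} \neq \emptyset$ and $\mathcal{F} \subseteq \mathbb{R}$. Let $m \in \mathbb{R}$ be an upper bound of $\mathcal{F}$. By Example 4.3.14,
$\bigcup \mathcal{F}$ is the least upper bound of $\mathcal{F}$. We show that $\bigcup \mathcal{F} \in \mathbb{R}$.

• Take $x \in \mathcal{F}$. Since Dedekind cuts are nonempty, $x \neq \emptyset$, which implies
that $\bigcup \mathcal{F} \neq \emptyset$.

• By hypothesis, $x \subseteq m$ for all $x \in \mathcal{F}$. Hence, $\bigcup \mathcal{F} \subseteq m$. Since $m \in \mathbb{R}$, we
have that $m \neq \mathbb{Q}$, so $\bigcup \mathcal{F} \subset \mathbb{Q}$.

• Let $x \in \bigcup \mathcal{F}$ and $y < x$. Thus, $x \in a$ for some Dedekind cut $a \in \mathcal{F}$.
Since $a$ is downward closed, $y \in a$, so $y \in \bigcup \mathcal{F}$. Hence, with the previous
part, $\bigcup \mathcal{F}$ is a proper initial segment of $\mathbb{Q}$.

• Let $x \in a \in \mathcal{F}$. Since $a$ is a Dedekind cut, it has no greatest element.
Thus, there exists $y \in a$ such that $x < y$. Because $y \in \bigcup \mathcal{F}$, we see that
$\bigcup \mathcal{F}$ has no greatest element.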


Since every nonempty bounded subset of real numbers has a least upper bound in R, the 
set of Dedekind cuts does not have the same issue with gaps as Q does. For example, 
the least upper bound of the set of real numbers 


{2, 2.7, 2.71, 2.718, 2.7182, 2.71828, ... } 


is the Dedekind cut that corresponds to e € R. 


Arithmetic 


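As with the other sets of numbers that have been constructed using the axioms of ZFC,
we now define addition and multiplication on $\mathbb{R}$. First, define the following Dedekind
cuts:

• $\mathbf{0} = \operatorname{seg}_{\leq}(\mathbb{Q}, 0)$
• $\mathbf{1} = \operatorname{seg}_{\leq}(\mathbb{Q}, 1)$.

Let $x, y \in \mathbb{R}$. That

\[ S = \{a + b : a \in x \wedge b \in y\} \]

is a Dedekind cut is Exercise 3. Assume that $\mathbf{0} \leq x$ and $\mathbf{0} \leq y$. We claim that the set

\[ P_1 = \{ab : 0 \leq a \in x \wedge 0 \leq b \in y\} \cup \mathbf{0} \]

is a Dedekind cut.

• $P_1 \neq \emptyset$ because $\mathbf{0} \neq \emptyset$.

• Let $u \in \mathbb{Q} \setminus x$ and $v \in \mathbb{Q} \setminus y$. Let

\[ m = \begin{cases} u & \text{if } u \geq v, \\ v & \text{if } v > u. \end{cases} \]

Then, $m^2 \notin P_1$, so $P_1 \neq \mathbb{Q}$.

• Let $0 < a \in x$ and $0 < b \in y$, and suppose that $w \in \mathbb{Q}$ such that $w < ab$. Hence,
$wb^{-1} < a$, which implies that $wb^{-1} \in x$ since $x$ is downward closed. Therefore,
$w \in P_1$ because

\[ w = (wb^{-1})b. \]

• Again, let $0 < a \in x$ and $0 < b \in y$. Since $x$ and $y$ do not have greatest
elements, there exists $u \in x$ and $v \in y$ such that $a < u$ and $b < v$. Then, by
Exercise 5.3.15(b)

\[ ab < uv \in P_1. \]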



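Furthermore, if $x < 0$ and $y < 0$, define
$$P_2 = \{ab : a \in x \wedge b \in y \wedge 0 \le -a \wedge 0 \le -b\} \cup 0, \qquad (5.15)$$
if $x < 0$ and $0 \le y$, define
$$P_3 = \{ab : a \in x \wedge 0 \le b \in y\}, \qquad (5.16)$$
or if $0 \le x$ and $y < 0$, define
$$P_4 = \{ab : 0 \le a \in x \wedge b \in y\}. \qquad (5.17)$$

Since $P_1$, $P_2$, $P_3$, $P_4$, and $S$ are Dedekind cuts (Exercise 4), we can use them to define the two standard operations on $\mathbb{R}$.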


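DEFINITION 5.6.9
Let $x, y \in \mathbb{R}$. Define

$$x + y = \{a + b : a \in x \wedge b \in y\}$$

and

$$x \cdot y = \begin{cases}
\{ab : 0 \le a \in x \wedge 0 \le b \in y\} \cup 0 & \text{if } 0 \le x \wedge 0 \le y, \\
\{ab : a \in x \wedge b \in y \wedge 0 \le -a \wedge 0 \le -b\} \cup 0 & \text{if } x < 0 \wedge y < 0, \\
\{ab : a \in x \wedge 0 \le b \in y\} & \text{if } x < 0 \wedge 0 \le y, \\
\{ab : 0 \le a \in x \wedge b \in y\} & \text{if } 0 \le x \wedge y < 0.
\end{cases}$$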


Since addition and multiplication on Q are associative and commutative, addition 
and multiplication are associative and commutative on R. 


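THEOREM 5.6.10

Addition and multiplication of real numbers are associative and commutative.

PROOF
Let $x, y, z \in \mathbb{R}$. We prove that addition is associative, leaving the rest to Exercise 7.

$$\begin{aligned}
x + (y + z) &= x + \{b + c : b \in y \wedge c \in z\} \\
&= \{a + v : a \in x \wedge v \in \{b + c : b \in y \wedge c \in z\}\} \\
&= \{a + (b + c) : a \in x \wedge (b \in y \wedge c \in z)\} \\
&= \{(a + b) + c : (a \in x \wedge b \in y) \wedge c \in z\} \\
&= \{u + c : u \in \{a + b : a \in x \wedge b \in y\} \wedge c \in z\} \\
&= \{a + b : a \in x \wedge b \in y\} + z \\
&= (x + y) + z.
\end{aligned}$$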




The Dedekind cuts 0 and 1 behave as expected. For example,
$$0 + 4 = \{a + b : a \in 0 \wedge b \in 4\} = \operatorname{seg}_{\le}(\mathbb{Q}, 4) = 4$$
and
$$1 \cdot 4 = \{ab : 0 \le a \in 1 \wedge 0 \le b \in 4\} \cup 0 = \operatorname{seg}_{\le}(\mathbb{Q}, 4) = 4.$$
These equations suggest the following.

THEOREM 5.6.11

$\mathbb{R}$ has additive and multiplicative identities.

PROOF
Let $x \in \mathbb{R}$. We first show that

$$x = \{a + b : a \in x \wedge b \in 0\} = x + 0.$$

Take $u \in x$. Since $x$ has no greatest element, there exists $v \in x$ such that $u < v$. Write $u = v + (u - v)$. Since $u - v < 0$, we have that $u \in x + 0$. Conversely, let $u \in x$ and $b \in 0$. Since $b \in 0$ implies that $b < 0$, we have that $u + b < u$ (Exercise 5.3.16), and because $x$ is downward closed, $u + b \in x$.

We next show that
$$x = x \cdot 1.$$

We have two cases to consider.

• Let $0 \le x$. By Definition 5.6.9,
$$x \cdot 1 = \{a \cdot b : 0 \le a \in x \wedge 0 \le b \in 1\} \cup 0.$$
Let $0 \le a \in x$ and $0 \le b < 1$. Then, $ab \le a$ (Exercise 5.3.17), so $ab \in x$ since $x$ is downward closed. Conversely, take $a \in x$. If $a < 0$, then $a \in 0$, and if $a = 0$, then $a = 0 \cdot 0$, so suppose $a > 0$. Since $x$ has no greatest element, there exists $u \in x$ such that $a < u$. This implies that $au^{-1} < 1$, so $a \in x \cdot 1$ because $a = u(au^{-1})$.

• Let $x < 0$ and proceed as in the previous case. ∎
We leave the proof of the last result to Exercise 10.

THEOREM 5.6.12
• Every element of $\mathbb{R}$ has an additive inverse.
• Every nonzero element of $\mathbb{R}$ has a multiplicative inverse.
• The distributive law holds for $\mathbb{R}$.


Complex Numbers 


The last set of numbers that we define are the complex numbers. 




DEFINITION 5.6.13

Define $\mathbb{C} = \mathbb{R} \times \mathbb{R}$ to be the set of complex numbers. Denote $(a, b) \in \mathbb{C}$ by $a + bi$.

Observe that the standard embedding $f : \mathbb{R} \to \mathbb{C}$ defined by $f(x) = (x, 0)$ allows us to consider $\mathbb{R}$ as a subset of $\mathbb{C}$. We will not define an order on $\mathbb{C}$ but will define the standard two operations.

DEFINITION 5.6.14
Let $a + bi, c + di \in \mathbb{C}$.
• $(a + bi) + (c + di) = (a + c) + (b + d)i$.
• $(a + bi) \cdot (c + di) = (ac - bd) + (ad + bc)i$.

We leave the proof of the following to Exercise 11.

THEOREM 5.6.15
• $i^2 = -1 + 0i$.
• Addition and multiplication are associative and commutative.
• $\mathbb{C}$ has additive and multiplicative identities.
• Every element of $\mathbb{C}$ has an additive inverse.
• Every nonzero element of $\mathbb{C}$ has a multiplicative inverse.
• The distributive law holds in $\mathbb{C}$.
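The two operations on ordered pairs can be tried out directly. The following Python sketch assumes that integers stand in for real numbers; the function names `add` and `mul` are illustrative and not part of the text.

```python
def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i on ordered pairs."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i on ordered pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

i = (0, 1)              # the pair denoted by 0 + 1i
i_squared = mul(i, i)   # should be the pair for -1 + 0i

z, w = (2, 3), (-1, 4)
commutes = mul(z, w) == mul(w, z)
```

Checking a few pairs this way is not a proof of Theorem 5.6.15, but it shows how every claim reduces to arithmetic in each coordinate.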


Exercises 


1. Find the given initial segments in the indicated posets.
(a) seg(Z, 50) from Example 4.3.6 
(b) seg.({0, 1}*, 101010) from Example 4.3.7 
(c) seg-(P(Z), {1,2,3,4}) from Example 4.3.9 


2. Let $(A, \preccurlyeq)$ be a poset. Assume that $B$ and $C$ are initial segments of $A$. Prove that $B \cap C$ is an initial segment of $A$. Is $B \cup C$ also an initial segment of $A$?

3. Let $x, y \in \mathbb{R}$. Prove that $S = \{a + b : a \in x \wedge b \in y\}$ is a Dedekind cut.

4. Prove that $P_2$ (5.15), $P_3$ (5.16), and $P_4$ (5.17) are Dedekind cuts.

5. Prove Theorem 5.6.7. 

6. Show that every Dedekind cut has an upper bound in R. 

7. Finish the proof of Theorem 5.6.10. 




8. Prove that 0 and 1 are Dedekind cuts. 

9. Let $a, b \in \mathbb{R}$ and $ab = 0$. Prove that $a = 0$ or $b = 0$.

10. Prove Theorem 5.6.12. 

11. Prove Theorem 5.6.15. 

12. Prove that between any two real numbers is another real number. 
13. Prove that between any two real numbers is a rational number. 


14. Show that $\mathbb{Q}$ can be embedded in $\mathbb{R}$ by proving that $f : \mathbb{Q} \to \mathbb{R}$ defined by

$$f(a) = \operatorname{seg}_{\le}(\mathbb{Q}, a)$$

is an order isomorphism preserving $\le$ with $\le$. Furthermore, show that both $\omega$ and $\mathbb{Z}$ can be embedded into $\mathbb{R}$.

15. Let $f$ be the function defined in Exercise 14. Show that $f(a + 1)$ is an upper bound of $f(a)$ for all $a \in \mathbb{Q}$.


16. The absolute value function (Exercise 2.4.23) can be defined so that for all $x \in \mathbb{R}$, $|x| = x \cup -x$, where $-x$ refers to the additive inverse of $x$ (Theorem 5.6.12). Let $a$ be a positive real number. Prove the following for every $x, y \in \mathbb{R}$.

(a) $|-x| = |x|$.
(b) $|x^2| = |x|^2$.
(c) $x \le |x|$.
(d) $|xy| = |x||y|$.
(e) $|x| < a$ if and only if $-a < x < a$.
(f) $a < |x|$ if and only if $a < x$ or $x < -a$.


CHAPTER 6 


ORDINALS AND CARDINALS 


6.1 ORDINAL NUMBERS 


In Chapter 5, we defined certain sets to represent collections of numbers. Despite being sets themselves, the elements of those sets were called numbers. We continue this association with sets as numbers but for a different purpose. While before we defined $\omega$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ to represent $\mathbf{N}$, $\mathbf{Z}$, $\mathbf{Q}$, $\mathbf{R}$, and $\mathbf{C}$, the definitions of this chapter are intended to be a means by which all sets can be classified according to a particular criterion. Specifically, in the later part of the chapter, we will define sets for the purpose of identifying the size of a given set, and we begin the chapter by defining sets that are used to identify whether two well-ordered sets have the same order type (Definition 4.5.24). A crucial tool in this pursuit is the following generalization of Theorem 5.5.1 to well-ordered infinite sets.

THEOREM 6.1.1 [Transfinite Induction 1]

Let $(A, \preccurlyeq)$ be a well-ordered set. If $B \subseteq A$ and $\operatorname{seg}_{\preccurlyeq}(A, x) \subseteq B$ implies $x \in B$ for all $x \in A$, then $A = B$.




A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’ Leary. 
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc. 




PROOF
To show that $A$ is a subset of $B$, suppose that $A \setminus B$ is nonempty. Since $A$ is well-ordered by $\preccurlyeq$, let $m$ be the least element of $A \setminus B$. This implies that $\operatorname{seg}_{\preccurlyeq}(A, m) \subseteq B$, so $m \in B$ by hypothesis, a contradiction.

Note that transfinite induction restricted to $\omega$ is simply strong induction (Theorem 5.5.1). To see this, let the well-ordered set $(A, \preccurlyeq)$ of Theorem 6.1.1 be $(\omega, \le)$. Define the set $B = \{k : p(k)\} \subseteq \omega$ for some formula $p(k)$. The conditional

$$\operatorname{seg}_{\le}(\omega, n) \subseteq B \to n \in B$$

implies $p(0)$ when $n = 0$ because $\operatorname{seg}_{\le}(\omega, 0) = \varnothing$ and implies

$$p(0) \wedge p(1) \wedge \cdots \wedge p(n - 1) \to p(n)$$

when $n > 0$ because $\operatorname{seg}_{\le}(\omega, n) = n$.
Our first use of transfinite induction is the following lemma. It uses the terminology 
of Exercise 4.4.32 and is the first of a sequence of lemmas that will play a critical role. 


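LEMMA 6.1.2

Let $(A, \preccurlyeq)$ be well-ordered. If $\varphi : A \to A$ is increasing, then $a \preccurlyeq \varphi(a)$ for all $a \in A$.

PROOF
Define $B = \{x \in A : x \preccurlyeq \varphi(x)\}$, where $\varphi$ is an increasing function $A \to A$. Let $\operatorname{seg}_{\preccurlyeq}(A, a) \subseteq B$. We note that $a$ is the least element of $A \setminus \operatorname{seg}_{\preccurlyeq}(A, a)$. Let $y \in \operatorname{seg}_{\preccurlyeq}(A, a)$. This implies that $y \preccurlyeq \varphi(y) \prec \varphi(a)$ by definition of $B$ and because $y \prec a$. Hence, $\varphi(a) \in A \setminus \operatorname{seg}_{\preccurlyeq}(A, a)$. Thus, $a \preccurlyeq \varphi(a)$ and $A = B$ by transfinite induction (Theorem 6.1.1).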
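Lemma 6.1.2 can be checked by brute force on a small finite chain. The Python sketch below assumes that "increasing" means strictly increasing and represents a function $n \to n$ as a tuple of values; the names `is_increasing` and `check_lemma` are illustrative only.

```python
from itertools import product

def is_increasing(f):
    """f[a] is the image of a; increasing means a < b implies f[a] < f[b]."""
    n = len(f)
    return all(f[a] < f[b] for a in range(n) for b in range(a + 1, n))

def check_lemma(n):
    """Check a <= f(a) for every increasing f : n -> n on the chain 0 < 1 < ... < n-1."""
    for f in product(range(n), repeat=n):
        if is_increasing(f) and not all(a <= f[a] for a in range(n)):
            return False
    return True

holds_on_small_chains = all(check_lemma(n) for n in range(1, 6))
```

The well-ordering hypothesis matters: on $(\mathbb{Z}, \le)$, which is not well-ordered, the increasing function $a \mapsto a - 1$ sends every element strictly below itself.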


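LEMMA 6.1.3

For all well-ordered sets $(A, \preccurlyeq)$ and $(A', \preccurlyeq')$, there exists at most one order isomorphism $\varphi : A \to A'$.

PROOF
Let $\varphi : A \to A'$ and $\psi : A \to A'$ be order isomorphisms. Since both $\varphi^{-1}$ and $\psi^{-1}$ are order isomorphisms $A' \to A$ (Theorem 4.5.26), $\psi^{-1} \circ \varphi$ and $\varphi^{-1} \circ \psi$ are order isomorphisms $A \to A$ (Theorem 4.5.27). We note that for every $b, c \in A$, if $b \prec c$, then $\varphi(b) \prec' \varphi(c)$ and then $\psi^{-1}(\varphi(b)) \prec \psi^{-1}(\varphi(c))$. This means that $\psi^{-1} \circ \varphi$ is increasing. A similar argument proves that $\varphi^{-1} \circ \psi$ is increasing. To show that $\varphi = \psi$, let $a \in A$. By Lemma 6.1.2,

$$a \preccurlyeq (\psi^{-1} \circ \varphi)(a)$$

and

$$a \preccurlyeq (\varphi^{-1} \circ \psi)(a).$$

Therefore, $\psi(a) \preccurlyeq' \varphi(a)$ and $\varphi(a) \preccurlyeq' \psi(a)$. Since $\preccurlyeq'$ is antisymmetric, we have that $\varphi(a) = \psi(a)$. ∎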




LEMMA 6.1.4

No well-ordered set $(A, \preccurlyeq)$ is order isomorphic to any of its proper initial segments.

PROOF
Let $(A, \preccurlyeq)$ be a well-ordered set. Suppose that $S$ is a proper initial segment of $A$. In order to obtain a contradiction, assume that $\varphi : A \to S$ is an order isomorphism. Take $a \in A \setminus S$. Since $\varphi(a) \in S$ and $\varphi$ is increasing, we have that $a \preccurlyeq \varphi(a) \prec a$ by Lemma 6.1.2, which is impossible.

The next result follows from Lemma 6.1.4 (Exercise 1).

LEMMA 6.1.5

Distinct initial segments of a well-ordered set are not order isomorphic.

The lemmas lead to the following theorem.

THEOREM 6.1.6

If $(A, \preccurlyeq)$ and $(B, \preccurlyeq')$ are well-ordered sets, there exists an order isomorphism such that exactly one of the following holds.

• $A \cong B$.
• $A$ is order isomorphic to a proper initial segment of $B$.
• $B$ is order isomorphic to a proper initial segment of $A$.


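PROOF
Let $(A, \preccurlyeq)$ and $(B, \preccurlyeq')$ be well-ordered sets. Appealing to Lemma 6.1.5, if $x \in A$, there is at most one $y \in B$ such that $\operatorname{seg}_{\preccurlyeq}(A, x) \cong \operatorname{seg}_{\preccurlyeq'}(B, y)$, so define the function

$$\varphi = \{(x, y) \in A \times B : \operatorname{seg}_{\preccurlyeq}(A, x) \cong \operatorname{seg}_{\preccurlyeq'}(B, y)\}.$$

We have a number of facts to prove.

• Let $y_1, y_2 \in \operatorname{ran}(\varphi)$ such that $y_1 = y_2$. Take $x_1, x_2 \in A$ such that
$$\operatorname{seg}_{\preccurlyeq}(A, x_1) \cong \operatorname{seg}_{\preccurlyeq'}(B, y_1)$$
and
$$\operatorname{seg}_{\preccurlyeq}(A, x_2) \cong \operatorname{seg}_{\preccurlyeq'}(B, y_2).$$
Then, we have $\operatorname{seg}_{\preccurlyeq}(A, x_1) \cong \operatorname{seg}_{\preccurlyeq}(A, x_2)$, and $x_1 = x_2$ by Lemma 6.1.5. Therefore, $\varphi$ is one-to-one.

• Take $x_1, x_2 \in \operatorname{dom}(\varphi)$ and assume that $x_1 \preccurlyeq x_2$. This implies that
$$\operatorname{seg}_{\preccurlyeq}(A, x_1) \subseteq \operatorname{seg}_{\preccurlyeq}(A, x_2).$$
Then, by definition of $\varphi$, we have
$$\operatorname{seg}_{\preccurlyeq}(A, x_1) \cong \operatorname{seg}_{\preccurlyeq'}(B, \varphi(x_1))$$
and
$$\operatorname{seg}_{\preccurlyeq}(A, x_2) \cong \operatorname{seg}_{\preccurlyeq'}(B, \varphi(x_2)).$$
Hence, $\operatorname{seg}_{\preccurlyeq'}(B, \varphi(x_1))$ is order isomorphic to an initial segment $S$ of $\operatorname{seg}_{\preccurlyeq'}(B, \varphi(x_2))$ (Exercise 17). If $S \ne \operatorname{seg}_{\preccurlyeq'}(B, \varphi(x_1))$, then $B$ has two distinct isomorphic initial segments, contradicting Lemma 6.1.5. This implies that $\varphi(x_1) \preccurlyeq' \varphi(x_2)$, so $\varphi$ is order-preserving.

• Let $x_1, x_2 \in A$. Suppose that $x_1 \preccurlyeq x_2$ and $x_2 \in \operatorname{dom}(\varphi)$. This means that there exists $y_2 \in B$ such that
$$\operatorname{seg}_{\preccurlyeq}(A, x_2) \cong \operatorname{seg}_{\preccurlyeq'}(B, y_2).$$
If $x_1 = x_2$, then $x_1 \in \operatorname{dom}(\varphi)$, so assume that $x_1 \ne x_2$. Since $x_1 \prec x_2$, we have that $x_1 \in \operatorname{seg}_{\preccurlyeq}(A, x_2)$. Because $\varphi$ is order-preserving,
$$\operatorname{seg}_{\preccurlyeq}(A, x_1) \cong \operatorname{seg}_{\preccurlyeq'}(B, y_1)$$
for some $y_1 \in \operatorname{seg}_{\preccurlyeq'}(B, y_2)$ (Exercise 17). Therefore, $(x_1, y_1) \in \varphi$, so $x_1 \in \operatorname{dom}(\varphi)$, proving that the domain of $\varphi$ is an initial segment of $A$.

• That the range of $\varphi$ is an initial segment of $B$ is proved like the previous case.

If $\varphi$ is a surjection and $\operatorname{dom}(\varphi) = A$, then $\varphi$ is an order isomorphism $A \to B$, else $\varphi^{-1}[B]$ is a proper initial segment of $A$. If $\varphi$ is not a surjection, $\varphi[A]$ is a proper initial segment of $B$. ∎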


Ordinals 


Theorem 6.1.6 is a sort of trichotomy law for well-ordered sets. Two well-ordered sets 
look alike, or one has a copy of itself in the other. This suggests that we should be 
able to choose certain well-ordered sets to serve as representatives of all the different 
types of well-ordered sets. No two of the chosen sets should be order isomorphic, but 
it should be the case that every well-ordered set is order isomorphic to exactly one of 
them. That is, we should be able to classify all of the well-ordered sets. This will be our immediate goal and is the purpose behind the next definition.


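DEFINITION 6.1.7

The set $\alpha$ is an ordinal number (or simply an ordinal) if $(\alpha, \subseteq)$ is a well-ordered set and $\beta = \operatorname{seg}_{\subseteq}(\alpha, \beta)$ for all $\beta \in \alpha$. For ordinals, define

$$\operatorname{seg}(\alpha, \beta) = \operatorname{seg}_{\subseteq}(\alpha, \beta).$$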




Definition 6.1.7 implies that $\omega$ and every natural number is an ordinal because they are well-ordered by $\subseteq$ and for all $n \in \omega \setminus \{0\}$,

$$n = \{0, 1, 2, \dots, n - 1\} = \operatorname{seg}(\omega, n),$$
and for all $k \in n$,
$$k = \{0, 1, 2, \dots, k - 1\} = \operatorname{seg}(n, k).$$

For example, $5 \in 7$ and
$$5 = \{0, 1, 2, 3, 4\} = \operatorname{seg}(7, 5).$$
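For finite sets, the two conditions in the definition of an ordinal can be tested mechanically. The sketch below models hereditarily finite sets with Python's `frozenset`; the helper names `von_neumann`, `seg`, and `is_ordinal` are hypothetical, and the test relies on the fact that a finite linear order is automatically a well-order.

```python
def von_neumann(n):
    """Return the natural number n as the set {0, 1, ..., n-1}."""
    current = frozenset()
    for _ in range(n):
        current = current | {current}  # successor: a ∪ {a}
    return current

def seg(alpha, beta):
    """Elements of alpha that are proper subsets of beta."""
    return frozenset(g for g in alpha if g < beta)

def is_ordinal(alpha):
    """Finite check: linearly ordered by inclusion and beta = seg(alpha, beta)."""
    linear = all(x <= y or y <= x for x in alpha for y in alpha)
    return linear and all(seg(alpha, b) == b for b in alpha)

five, seven = von_neumann(5), von_neumann(7)
example_holds = five in seven and seg(seven, five) == five

# The set {0, 2} is linearly ordered by inclusion but skips the element 1.
gap = frozenset({von_neumann(0), von_neumann(2)})
```

Here `is_ordinal(seven)` succeeds, while `is_ordinal(gap)` fails because the initial segment of `gap` below 2 is $\{0\}$, which is not 2.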


We now prove a sequence of basic results about ordinals. The first is similar to 
Theorem 5.2.8, so its proof is left to Exercise 5. 


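THEOREM 6.1.8

Ordinals are transitive sets.

THEOREM 6.1.9

The elements of ordinals are transitive sets.

PROOF
Let $\alpha$ be an ordinal and $\beta \in \alpha$. Take $\gamma \in \beta$ and $\delta \in \gamma$. Since $\alpha$ is transitive (Theorem 6.1.8), we have that $\gamma \in \alpha$. Therefore, $\gamma = \operatorname{seg}(\alpha, \gamma)$ and $\beta = \operatorname{seg}(\alpha, \beta)$, so

$$\delta \in \operatorname{seg}(\alpha, \gamma) \subseteq \operatorname{seg}(\alpha, \beta) = \beta,$$
which implies that $\beta$ is transitive (Definition 5.2.7). ∎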


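THEOREM 6.1.10

Every element of an ordinal is an ordinal.

PROOF
Let $\alpha$ be an ordinal and $\beta \in \alpha$. Notice that this implies that $\beta$ is transitive (Theorem 6.1.9). Since $\beta \subseteq \alpha$, we have that $(\beta, \subseteq)$ is a well-ordered set by a subset axiom (5.1.8) and Theorem 4.3.26. Now take $\delta \in \beta$. Since $\delta \in \alpha$, we have that $\delta$ is transitive. Therefore, by Exercise 5.2.3,

$$\begin{aligned}
\delta &= \{\gamma : \gamma \in \delta\} \\
&= \{\gamma : \gamma \in \beta \wedge \gamma \in \delta\} \\
&\subseteq \{\gamma : \gamma \in \beta \wedge \gamma \subset \delta\} \\
&= \operatorname{seg}(\beta, \delta) \\
&\subseteq \operatorname{seg}(\alpha, \delta) \\
&= \delta.
\end{aligned}$$

From this, we conclude that $\delta = \operatorname{seg}(\beta, \delta)$. ∎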




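THEOREM 6.1.11
Let $\alpha$ and $\beta$ be ordinals. Then, $\alpha \subset \beta$ if and only if $\alpha \in \beta$.

PROOF
If $\alpha \in \beta$, then $\alpha \subset \beta$ because $\beta$ is transitive (Theorem 6.1.8 and Exercise 5.2.3). Conversely, suppose that $\alpha \subset \beta$. Let $\gamma \in \alpha$ and $\delta \subset \gamma$ with $\delta \in \beta$. Since $\beta$ is an ordinal, $\delta = \operatorname{seg}(\beta, \delta)$. Hence, $\delta = \operatorname{seg}(\gamma, \delta)$, which implies that $\delta \in \gamma$ because $\gamma$ is an ordinal by Theorem 6.1.10. Therefore, $\delta \in \alpha$ because $\alpha$ is transitive (Theorem 6.1.8). This shows that $\alpha$ is a proper initial segment of $\beta$ with respect to $\subseteq$ (Definition 5.6.1). From this, it follows by Lemma 5.6.3 that $\alpha = \operatorname{seg}(\beta, \varepsilon)$ for some $\varepsilon \in \beta$. Hence, $\alpha \in \beta$.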


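THEOREM 6.1.12

Every ordinal is well-ordered by $\in$.

The next theorem is an important part of the process of showing that the ordinals are the sets that classify all well-ordered sets according to their order types. It states that distinct ordinals are not order isomorphic with respect to $\subseteq$.

THEOREM 6.1.13
For all ordinals $\alpha$ and $\beta$, if $(\alpha, \subseteq) \cong (\beta, \subseteq)$, then $\alpha = \beta$.

PROOF
Let $\varphi : \alpha \to \beta$ be an order isomorphism preserving $\subseteq$. Define

$$A = \{\gamma \in \alpha : \varphi(\gamma) = \gamma\}.$$
Take $\delta \in \alpha$ and assume that $\operatorname{seg}(\alpha, \delta) \subseteq A$. Then,
$$\begin{aligned}
\varphi(\delta) &= \operatorname{seg}(\beta, \varphi(\delta)) \\
&= \varphi[\operatorname{seg}(\alpha, \delta)] \\
&= \{\varphi(\gamma) : \gamma \in \alpha \wedge \gamma \subset \delta\} \\
&= \{\gamma : \gamma \subset \delta\} \\
&= \delta.
\end{aligned}$$
The first equality follows because $\varphi(\delta)$ is an ordinal in $\beta$, the second follows because $\varphi$ is an order isomorphism, and the fourth follows by the assumption. Therefore, by transfinite induction (Theorem 6.1.1), $A = \alpha$, so $\varphi$ is the identity map and $\alpha = \beta$.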


Because of Theorem 6.1.13, we are able to prove that there is a trichotomy law for the ordinals with respect to $\subset$.

THEOREM 6.1.14 [Trichotomy]

For all ordinals $\alpha$ and $\beta$, exactly one of the following holds: $\alpha = \beta$, $\alpha \subset \beta$, or $\beta \subset \alpha$.




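PROOF
Since $(\alpha, \subseteq)$ and $(\beta, \subseteq)$ are well-ordered, by Theorem 6.1.6, exactly one of the following holds.

• $\alpha \cong \beta$, which implies that $\alpha = \beta$ by Theorem 6.1.13.

• There exists $\delta \in \beta$ such that $\alpha \cong \operatorname{seg}(\beta, \delta)$. Since $\operatorname{seg}(\beta, \delta)$ is an ordinal (Theorem 6.1.10), $\alpha = \operatorname{seg}(\beta, \delta)$, again by Theorem 6.1.13. Therefore, $\alpha \subset \beta$.

• There exists $\gamma \in \alpha$ such that $\beta \cong \operatorname{seg}(\alpha, \gamma)$. As in the previous case, we have that $\beta \subset \alpha$. ∎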


Because of Theorem 6.1.11, we can quickly conclude the following.

COROLLARY 6.1.15

For all ordinals $\alpha$ and $\beta$, exactly one of the following holds: $\alpha = \beta$, $\alpha \in \beta$, or $\beta \in \alpha$.

In addition to the ordinals having a trichotomy law, the least upper bound with respect to $\subseteq$ of a set of ordinals is also an ordinal (compare Example 4.3.14).


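THEOREM 6.1.16

If $\mathcal{F}$ is a set of ordinals, $\bigcup \mathcal{F}$ is an ordinal.

PROOF
We show that $\bigcup \mathcal{F}$ satisfies the conditions of Definition 6.1.7. Since the elements of ordinals are ordinals, $\bigcup \mathcal{F}$ is a set of ordinals, and by Theorem 6.1.14, we see that $(\bigcup \mathcal{F}, \subseteq)$ is a linearly ordered set. Let $B \subseteq \bigcup \mathcal{F}$ be nonempty and take $\alpha \in B$. We have two cases to consider.

• Suppose $\alpha \cap B = \varnothing$. Let $\beta \in B$. Then, $\beta \notin \alpha$, so by Theorem 6.1.11 and Theorem 6.1.14, $\alpha \subseteq \beta$. Hence, $\alpha$ is the least element of $B$.

• Let $\alpha \cap B$ be nonempty. Since $\alpha$ is an ordinal, there exists an ordinal $\delta$ that is the least element of $\alpha \cap B$ with respect to $\subseteq$. Let $\beta \in B$. If $\beta \subset \alpha$, then $\beta \in \alpha \cap B$, which implies that $\delta \subseteq \beta$. Also, if $\alpha \subseteq \beta$, then $\delta \subseteq \beta$. Since these are the only two options (Theorem 6.1.14), this implies that $\delta$ is the least element of $B$.

We conclude that $(\bigcup \mathcal{F}, \subseteq)$ is a well-ordered set.

Next, let $\beta \in \bigcup \mathcal{F}$. This means that there exists an ordinal $\alpha \in \mathcal{F}$ such that $\beta \in \alpha$. Since $\operatorname{seg}(\bigcup \mathcal{F}, \beta) \subseteq \beta$ by definition, take $\delta \in \beta$. Since $\alpha$ is transitive (Theorem 6.1.8), $\delta \in \alpha$. Therefore, $\delta \in \bigcup \mathcal{F}$, which implies that $\beta \subseteq \operatorname{seg}(\bigcup \mathcal{F}, \beta)$.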




Classification 


Let $\alpha$ be an ordinal number. We check the two conditions of Definition 6.1.7 to show that $\alpha^+$ is an ordinal.

• Let $B$ be a nonempty subset of $\alpha \cup \{\alpha\}$. If $B \cap \alpha \ne \varnothing$, then $B$ has a least element with respect to $\subseteq$ since $(\alpha, \subseteq)$ is well-ordered. If $B = \{\alpha\}$, then $\alpha$ is the least element of $B$.

• Let $\beta \in \alpha \cup \{\alpha\}$. If $\beta \in \alpha$, then $\beta = \operatorname{seg}(\alpha, \beta) = \operatorname{seg}(\alpha^+, \beta)$ because $\alpha$ is an ordinal number. Otherwise, $\beta = \alpha = \operatorname{seg}(\alpha^+, \alpha)$.

The ordinal $\alpha^+$ is called a successor ordinal because it has a predecessor. For example, every positive natural number is a successor ordinal.

Now assume that $\alpha \ne \varnothing$ and $\alpha$ is an ordinal that is not a successor.

• Let $\delta \in \beta \in \alpha$. Since $\alpha$ is transitive (Theorem 6.1.8), $\delta \in \alpha$. Thus, we conclude that $\bigcup\{\beta : \beta \in \alpha\} \subseteq \alpha$.

• Now take $\delta \in \alpha$. This implies that $\delta$ is an ordinal (Theorem 6.1.10), so $\delta \subset \alpha$ by Theorem 6.1.11. Therefore, $\delta^+ \subseteq \alpha$, so $\delta^+ \subset \alpha$ since $\alpha$ is not a successor. Thus, $\delta^+ \in \alpha$, again by appealing to Theorem 6.1.11. Because $\delta \in \delta^+$, we have that $\alpha \subseteq \bigcup\{\beta : \beta \in \alpha\}$.

We conclude that for every nonempty ordinal $\alpha$ that is not a successor,

$$\alpha = \bigcup_{\beta \in \alpha} \beta.$$

Such an ordinal number is called a limit ordinal. For example, since every natural number is an ordinal, $\omega = \bigcup\{n : n \in \omega\}$ is a limit ordinal. Therefore, $\omega^+, \omega^{++}, \omega^{+++}, \dots$ are also ordinals, but they are successors.

All of this proves the following. 


THEOREM 6.1.17

A nonempty ordinal is either a successor or a limit ordinal.

Therefore, by Theorem 6.1.14 and Corollary 6.1.15, we can view the ordinals as sorted by $\subset$ giving

$$0 \subset 1 \subset 2 \subset \cdots \subset \omega \subset \omega^+ \subset \omega^{++} \subset \omega^{+++} \subset \cdots$$
and as sorted by $\in$ giving
$$0 \in 1 \in 2 \in \cdots \in \omega \in \omega^+ \in \omega^{++} \in \omega^{+++} \in \cdots.$$

Characterizing every ordinal as being equal to 0, a successor ordinal, or a limit ordinal allows us to restate Theorem 6.1.1. The form of the theorem generalizes Theorem 5.4.1 to infinite ordinals. Its proof is left to Exercise 8.




THEOREM 6.1.18 [Transfinite Induction 2]

If $\alpha$ is an ordinal and $A \subseteq \alpha$, then $A = \alpha$ if the following hold:
• $0 \in A$.
• If $\beta \in A$, then $\beta^+ \in A$.
• If $\beta$ is a limit ordinal such that $\delta \in A$ for all $\delta \in \beta$, then $\beta \in A$.


We use this second form of transfinite induction to prove the sought-after classification 
theorem for well-ordered sets. 


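THEOREM 6.1.19

Let $(A, \preccurlyeq)$ be a well-ordered set. Then, $(A, \preccurlyeq) \cong (\alpha, \subseteq)$ for some ordinal $\alpha$.

PROOF
Define
$$p(x, y) := y \text{ is an ordinal} \wedge \operatorname{seg}_{\preccurlyeq}(A, x) \cong y.$$

Let $B = \{y : \exists x[x \in A \wedge p(x, y)]\}$. By Theorem 6.1.13, $p(x, y)$ defines a function, so by a replacement axiom (5.1.9), we conclude that $B$ is a set. We have a number of items to prove.

Let $D \subseteq B$ and $D \ne \varnothing$. Let $C = \{a \in A : \exists \alpha[\alpha \in D \wedge p(a, \alpha)]\}$. Observe that $C$ is not empty. Therefore, there exists a least element $m \in C$ with respect to $\preccurlyeq$. Take an ordinal $\delta_0 \in D$ such that

$$\operatorname{seg}_{\preccurlyeq}(A, m) \cong \delta_0.$$
Let $\delta \in D$. This means that $\delta$ is an ordinal and
$$\operatorname{seg}_{\preccurlyeq}(A, c) \cong \delta$$
for some $c \in C$. Since $m \preccurlyeq c$, we have that
$$\operatorname{seg}_{\preccurlyeq}(A, m) \subseteq \operatorname{seg}_{\preccurlyeq}(A, c).$$

Hence, $\delta_0$ is isomorphic to a subset of $\delta$, which implies that $\delta_0 \subseteq \delta$ (Theorem 6.1.13). We conclude that $(B, \subseteq)$ is a well-ordered set.

Let $E = \{\beta \in B : \operatorname{seg}(B, \beta) = \beta\}$. Let $\operatorname{seg}(B, \varepsilon) \subseteq E$ for $\varepsilon \in B$.

• First, suppose that $\varepsilon = \gamma^+$ for some ordinal $\gamma$. Then, $\operatorname{seg}(B, \gamma) = \gamma$, so
$$\operatorname{seg}(B, \gamma) \cup \{\gamma\} = \gamma^+.$$
Also, $\gamma^+ \cong \operatorname{seg}_{\preccurlyeq}(A, a)$ for some $a \in A$. Let $m$ be the greatest element of $\operatorname{seg}_{\preccurlyeq}(A, a)$ (Exercise 9). This implies that $\gamma \cong \operatorname{seg}_{\preccurlyeq}(A, a) \setminus \{m\}$, so $\gamma \in B$. Hence,
$$\operatorname{seg}(B, \gamma) \cup \{\gamma\} = \operatorname{seg}(B, \gamma^+),$$
and we have $\varepsilon \in E$.

• Second, let $\varepsilon = \bigcup\{\gamma : \gamma \in \varepsilon\}$. This means that $\operatorname{seg}(B, \gamma) = \gamma$ for all $\gamma \in \varepsilon$. Therefore,
$$\operatorname{seg}(B, \varepsilon) = \bigcup_{\gamma \in \varepsilon} \operatorname{seg}(B, \gamma) = \bigcup_{\gamma \in \varepsilon} \gamma = \varepsilon,$$
and $\varepsilon$ is again an element of $E$.

By transfinite induction (Theorem 6.1.18), $E = B$. This combined with $(B, \subseteq)$ being a well-ordered set means that $B$ is an ordinal.

Define $\varphi : A \to B$ by $\varphi(x) = y \Leftrightarrow p(x, y)$. Since $\varphi$ is an order isomorphism (Exercise 10), $B$ is an ordinal that is order isomorphic to $(A, \preccurlyeq)$, and because of Theorem 6.1.13, it is the only one.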


For any well-ordered set $(A, \preccurlyeq)$, the unique ordinal $\alpha$ such that $A \cong \alpha$ guaranteed by Theorem 6.1.19 is called the order type of $A$. Compare this definition with Definition 4.5.24. For example, the order type of $(\{2n : n > 5 \wedge n \in \mathbb{Z}\}, \le)$ is $\omega$.
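The order isomorphism behind this example can be written down explicitly. The Python sketch below assumes the map $k \mapsto 2k + 12$ from $\omega$ onto $\{2n : n > 5\}$ (this formula is supplied here, not taken from the text) and checks it on a finite prefix only.

```python
def iso(k):
    """Candidate order isomorphism from omega onto {2n : n > 5}: k -> 2k + 12."""
    return 2 * k + 12

# The first ten values of the map, and the ten least elements of the set.
prefix = [iso(k) for k in range(10)]
least_elements = sorted(2 * n for n in range(6, 16))

# Order preservation on the prefix: j < k implies iso(j) < iso(k).
preserves_order = all(iso(j) < iso(k) for k in range(10) for j in range(k))
```

The map lists the set in increasing order with no first element skipped, which is exactly what having order type $\omega$ requires.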


Burali-Forti and Hartogs 


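Suppose $\mathcal{A} = \{0, 4, 6, 9\}$. Then, $\bigcup \mathcal{A}$ equals the ordinal 9, which is the least upper bound of $\mathcal{A}$. Also, assume that $\mathcal{B} = \{5, 100, \omega\}$. Then, $\bigcup \mathcal{B} = \omega$. However, the least upper bound of $\mathcal{C} = \{n \in \omega : \exists k(k \in \omega \wedge n = 2k)\}$ is not an element of $\mathcal{C}$. Instead, the least upper bound of $\mathcal{C}$ is $\bigcup \mathcal{C} = \omega$. Moreover, notice that $\mathcal{A} \subseteq 10$, $\mathcal{B} \subseteq \omega^+$, and $\mathcal{C} \subseteq \omega$. We generalize this to the next theorem.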


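THEOREM 6.1.20

If $\mathcal{F}$ is a set of ordinals, there exists an ordinal $\alpha$ such that $\mathcal{F} \subseteq \alpha$.

PROOF
Take $\mathcal{F}$ to be a set of ordinals and let $\alpha \in \mathcal{F}$. Then, $\alpha \subseteq \bigcup \mathcal{F}$ and $\bigcup \mathcal{F}$ is an ordinal by Theorem 6.1.16. If $\alpha \subset \bigcup \mathcal{F}$, then $\alpha \in \bigcup \mathcal{F}$ by Theorem 6.1.11. If $\alpha = \bigcup \mathcal{F}$, then $\alpha \in \{\bigcup \mathcal{F}\}$. Thus, $\mathcal{F} \subseteq (\bigcup \mathcal{F})^+$. ∎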
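For a finite set of finite ordinals, both the union and its successor can be computed. The sketch below again uses `frozenset` to model sets; `von_neumann`, `big_union`, and `successor` are hypothetical helper names, and the family $\{0, 4, 6, 9\}$ is the one from the example preceding Theorem 6.1.20.

```python
def von_neumann(n):
    """Return the natural number n as the set {0, 1, ..., n-1}."""
    current = frozenset()
    for _ in range(n):
        current = current | {current}
    return current

def big_union(family):
    """Union of all members of a family of sets."""
    out = frozenset()
    for member in family:
        out = out | member
    return out

def successor(alpha):
    """The successor of alpha, namely alpha ∪ {alpha}."""
    return alpha | {alpha}

family = frozenset(von_neumann(n) for n in (0, 4, 6, 9))
least_upper_bound = big_union(family)   # expected to be the ordinal 9
bound = successor(least_upper_bound)    # expected to be the ordinal 10
```

The family is a subset of `bound` but not of `least_upper_bound`, since 9 is not an element of 9. This is why the proof passes to the successor.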


Although every set of ordinals is a subset of an ordinal, there is no set of all ordinals, otherwise a contradiction would arise, as was first discovered by Cesare Burali-Forti (1897). This is why when we noted that $\subseteq$ gives the ordinals a linear order, we did not claim that $\subseteq$ is used to define a linearly ordered set containing all ordinals.

THEOREM 6.1.21 [Burali-Forti]

There is no set that has every ordinal as an element.

PROOF
Suppose $\mathcal{F} = \{\alpha : \alpha \text{ is an ordinal}\}$ is a set. This implies that $\bigcup \mathcal{F}$ is an ordinal by Theorem 6.1.16. However, for every $\alpha \in \mathcal{F}$, we have that $\alpha \in \alpha^+ \in \mathcal{F}$, showing that $\mathcal{F} \subseteq \bigcup \mathcal{F}$. Since $\bigcup \mathcal{F} \in \mathcal{F}$, we also have $\bigcup \mathcal{F} \in \bigcup \mathcal{F}$, which contradicts Theorem 5.1.16.




The Burali-Forti theorem places a limit on what can be done with ordinals. One such 
example is a theorem of Friedrich Hartogs. 


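THEOREM 6.1.22 [Hartogs]

For every set $A$, there exists an ordinal $\alpha$ such that there are no injections of $\alpha$ into $A$.

PROOF
Let $A$ be a set. Define

$$\mathcal{E} = \{\alpha : \alpha \text{ is an ordinal} \wedge \exists \psi(\psi \text{ is an injection } \alpha \to A)\}.$$
Notice that for every $\alpha \in \mathcal{E}$, there exists a bijection $\varphi_\alpha$ such that
$$\varphi_\alpha : \alpha \to B_\alpha$$
for some $B_\alpha \subseteq A$. Define a well-order $\preccurlyeq_\alpha$ on $B_\alpha$ by
$$\varphi_\alpha(\beta_1) \preccurlyeq_\alpha \varphi_\alpha(\beta_2) \text{ if and only if } \beta_1 \subseteq \beta_2 \text{ for all } \beta_1, \beta_2 \in \alpha.$$
Then, $\varphi_\alpha$ is an order isomorphism preserving $\subseteq$ with $\preccurlyeq_\alpha$. Next, define
$$\mathcal{F} = \{(B, \le) : B \subseteq A \wedge \le \text{ is a well-ordering of } B\}.$$

Since $\mathcal{F} \subseteq P(A) \times P(A \times A)$, we have that $\mathcal{F}$ is a set by the Power Set Axiom (5.1.7) and a Subset Axiom (5.1.8). Let

$$p(x, y) := x \in \mathcal{F} \wedge \exists \psi(\psi \text{ is an order isomorphism } x \to y \text{ preserving the order on } x \text{ with } \subseteq).$$

Suppose that $p((B, \le), \alpha_1)$ and $p((B, \le), \alpha_2)$. By Theorem 6.1.13, we have that $\alpha_1 = \alpha_2$, so $p(x, y)$ defines a function with domain $\mathcal{F}$. Moreover, $\mathcal{E}$ is a subset of the range of this function because $p((B_\alpha, \preccurlyeq_\alpha), \alpha)$ due to $\varphi_\alpha$. Therefore, $\mathcal{E}$ is a set by a replacement axiom (5.1.9) and a subset axiom, and $\mathcal{E}$ cannot contain all ordinals by the Burali-Forti theorem (6.1.21). ∎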


Transfinite Recursion 


Theorem 6.1.19 only applies to well-ordered sets, so, for example, it does not apply to $(\mathbb{Z}, \le)$ or $(\mathbb{R}, \le)$. However, if we change the order on $\mathbb{Z}$ from the standard $\le$ to $\preccurlyeq$ defined so as to put $\mathbb{Z}$ into this order,

$$0, 1, -1, 2, -2, 3, -3, \dots,$$

then $(\mathbb{Z}, \preccurlyeq)$ is a well-ordered set of order type $\omega$. That this can be done even with sets like $\mathbb{R}$ is due to a theorem first proved by Zermelo, which is often called the well-ordering theorem. Its proof requires some preliminary work.

Let $A$ be a set and $\alpha$ an ordinal. By Definition 4.4.13, ${}^{\alpha}A$ is the set of all functions $\alpha \to A$. Along these lines, define

$${}^{<\alpha}A = \{\varphi : \exists \beta(\beta \in \alpha \wedge \varphi \text{ is a function } \beta \to A)\}.$$
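The reordering of $\mathbb{Z}$ as $0, 1, -1, 2, -2, \dots$ amounts to assigning each integer its position in that list. The Python sketch below uses the hypothetical names `rank` and `unrank` for the position map and its inverse; these formulas are one way to realize the listing and are not taken from the text.

```python
def rank(z):
    """Position of the integer z in the listing 0, 1, -1, 2, -2, 3, -3, ..."""
    return 2 * z - 1 if z > 0 else -2 * z

def unrank(k):
    """Integer found at position k of the listing (inverse of rank)."""
    return (k + 1) // 2 if k % 2 == 1 else -(k // 2)

# Sorting a window of integers by rank reproduces the start of the listing.
listing = sorted(range(-3, 4), key=rank)

# rank and unrank undo each other on a finite window.
round_trip = all(unrank(rank(z)) == z for z in range(-50, 51))
```

Defining $a \preccurlyeq b$ to mean `rank(a) <= rank(b)` transfers the well-order of $\omega$ to $\mathbb{Z}$, which is why the resulting order type is $\omega$.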




For example, $f, g \in {}^{<5}\mathbb{Z}$, where
$$f = \{(0, 1), (1, 2), (2, 3), (3, 4)\}$$
and
$$g = \{(0, -4), (1, 14)\}.$$

Also, $f, g \in {}^{<\omega}\mathbb{Z}$, but the identity function on $\omega$ is not an element of ${}^{<\omega}\mathbb{Z}$ because its domain is $\omega$. We should also note that for any set $A$,

$${}^{<0}A = \varnothing.$$
We use this notation in the following generalization of recursion to infinite ordinals.


THEOREM 6.1.23 [Transfinite Recursion]

Let $\alpha$ be an ordinal. For every function $\psi : {}^{<\alpha}A \to A$, there exists a unique function $\varphi : \alpha \to A$ such that for every $\beta \in \alpha$,

$$\varphi(\beta) = \psi(\varphi \upharpoonright \beta).$$


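PROOF
To prove uniqueness, in addition to $\varphi$, let $\varphi'$ be a function $\alpha \to A$ such that for all $\beta \in \alpha$,
$$\varphi'(\beta) = \psi(\varphi' \upharpoonright \beta).$$
Define $B = \{\beta \in \alpha : \varphi(\beta) = \varphi'(\beta)\}$. We use transfinite induction (Theorem 6.1.1) to show that $B = \alpha$. Suppose $\operatorname{seg}(\alpha, \delta) \subseteq B$ with $\delta \in \alpha$. That is,

$$\forall \beta[\beta \in \delta \to \varphi(\beta) = \varphi'(\beta)].$$
This implies that $\varphi \upharpoonright \delta = \varphi' \upharpoonright \delta$. Therefore,
$$\varphi(\delta) = \psi(\varphi \upharpoonright \delta) = \psi(\varphi' \upharpoonright \delta) = \varphi'(\delta),$$

so $\delta \in B$, and we conclude that $\varphi = \varphi'$.

We prove existence indirectly. Suppose that $\alpha$ is the least ordinal (Exercise 2) such that there exists a function $\psi_0 : {}^{<\alpha}A \to A$ such that for every $\varphi : \alpha \to A$, there exists $\beta \in \alpha$ such that $\varphi(\beta) \ne \psi_0(\varphi \upharpoonright \beta)$.

Since the theorem is trivially true for $\alpha = 0$, we have two cases to consider.

• Let $\alpha = \delta^+$ for some ordinal $\delta$. By minimality of $\alpha$, we have a function $\varphi_\delta : \delta \to A$ such that for all $\beta \in \delta$,
$$\varphi_\delta(\beta) = \psi_0(\varphi_\delta \upharpoonright \beta).$$
Extend $\varphi_\delta$ to $\bar{\varphi}_\delta : \alpha \to A$ by defining $\bar{\varphi}_\delta(\delta) = \psi_0(\varphi_\delta)$, so
$$\bar{\varphi}_\delta(\delta) = \psi_0(\bar{\varphi}_\delta \upharpoonright \delta).$$
This contradicts the minimality of $\alpha$.

• Let $\alpha$ be a limit ordinal. For each $\delta \in \alpha$, there exists a unique $\varphi_\delta : \delta \to A$ such that for all $\beta \in \delta$,
$$\varphi_\delta(\beta) = \psi_0(\varphi_\delta \upharpoonright \beta).$$
Notice that $\delta \in \gamma$ implies that $\varphi_\gamma$ is an extension of $\varphi_\delta$, otherwise $\varphi_\gamma \upharpoonright \delta$ would have the property
$$(\varphi_\gamma \upharpoonright \delta)(\beta) = \psi_0([\varphi_\gamma \upharpoonright \delta] \upharpoonright \beta)$$
for all $\beta \in \delta$ yet $\varphi_\delta \ne \varphi_\gamma \upharpoonright \delta$. This contradicts the uniqueness of $\varphi_\delta$. Therefore, $\{\varphi_\delta : \delta \in \alpha\}$ is a chain, so, as in Exercise 4.4.13, define the function $\varphi : \alpha \to A$ by
$$\varphi = \bigcup_{\delta \in \alpha} \varphi_\delta.$$
To check that $\varphi$ is the function given by the theorem, take $\beta \in \alpha$. Since $\alpha$ is a limit ordinal, $\beta^+ \in \alpha$ and
$$\varphi(\beta) = \varphi_{\beta^+}(\beta) = \psi_0(\varphi_{\beta^+} \upharpoonright \beta) = \psi_0(\varphi \upharpoonright \beta),$$
again contradicting the minimality of $\alpha$. ∎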
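Restricted to a finite ordinal $n$, the recursion scheme $\varphi(\beta) = \psi(\varphi \upharpoonright \beta)$ is ordinary course-of-values recursion and can be run directly. In the Python sketch below, a tuple of earlier values stands in for the restriction $\varphi \upharpoonright \beta$; the names `recurse` and `fib_step` are illustrative assumptions.

```python
def recurse(psi, n):
    """Return [phi(0), ..., phi(n-1)], where phi(b) = psi(phi restricted to b)."""
    values = []
    for _ in range(n):
        # tuple(values) plays the role of the restriction of phi to the current b
        values.append(psi(tuple(values)))
    return values

def fib_step(history):
    """A psi whose recursion yields the Fibonacci numbers."""
    if len(history) < 2:
        return len(history)           # phi(0) = 0 and phi(1) = 1
    return history[-1] + history[-2]  # phi(b) depends only on the two prior values

first_eight = recurse(fib_step, 8)
```

Each value is determined by the values before it, which mirrors the uniqueness half of the proof. The existence half for infinite ordinals is what needs the limit case.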


Theorem 6.1.23 has a corollary that can be viewed as an extension of Theorem 5.2.14. 
Its proof is left to Exercise 18. 


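COROLLARY 6.1.24

Let $A$ be a set and $a \in A$. For every ordinal $\alpha$, if $\psi$ is a function $A \to A$, there exists a unique function $\varphi : \alpha \to A$ such that

• $\varphi(0) = a$,
• $\varphi(\beta^+) = \psi(\varphi(\beta))$ for all $\beta^+ \in \alpha$,
• $\varphi(\gamma) = \bigcup\{\varphi(\beta) : \beta \in \gamma\}$ for all limit ordinals $\gamma \in \alpha$.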


We are now ready to prove that every set can be well-ordered. The following theorem 
is equivalent to the axiom of choice (Exercise 20). 


THEOREM 6.1.25 [Zermelo]

For any set $A$, there exists a relation $R$ on $A$ such that $(A, R)$ is a well-ordered set.




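PROOF
Take a set $A$ and let $\xi : P(A) \to A$ be a choice function (Corollary 5.1.11). By Theorem 6.1.22, there is an ordinal $\alpha$ such that no injection $\alpha \to A$ exists. Hence, we have $\varphi : A \to \alpha$ that is one-to-one. Let $B \subseteq A$. Since every element of $\alpha$ is an ordinal, there exists an ordinal $\delta \subseteq \alpha$ such that $\varphi[B] \cong \delta$ (Theorem 6.1.16). Define

$$\psi_B = \varphi \upharpoonright B.$$
Then, $\psi_B^{-1} \in {}^{<\alpha}A$. Let
$$P = \{\psi_B^{-1} : B \in P(A)\},$$
and define
$$h_1 : P \to P(A)$$
by $h_1(\psi_B^{-1}) = B$. Also, define
$$h_2 : P \to A$$
by $h_2 = \xi \circ h_1$. Since $P \subseteq {}^{<\alpha}A$, extend $h_2$ to some
$$h : {}^{<\alpha}A \to A$$
such that $h \upharpoonright P = h_2$. By transfinite recursion (Theorem 6.1.23), there exists a function $f : \alpha \to A$ such that for all $\beta \in \alpha$,

$$f(\beta) = h(f \upharpoonright \beta).$$
Define $\Phi$ so that for all $\beta \in \alpha$,

$$\Phi(\beta) = \begin{cases} h(A \setminus f[\beta]) & \text{if } A \setminus f[\beta] \ne \varnothing, \\ A & \text{if } A \setminus f[\beta] = \varnothing. \end{cases}$$

Notice that $A \notin A$ by Theorem 5.1.16. Let $\beta_0$ be the least ordinal such that $\Phi(\beta_0) = A$. Then, $\Phi \upharpoonright \beta_0$ is a bijection $\beta_0 \to A$ [Exercise 19(a)]. Lastly, define the relation $R$ on $A$ by

$$a \mathrel{R} b \text{ if and only if } \Phi^{-1}(a) \subseteq \Phi^{-1}(b),$$

for all $a, b \in A$. Since $\subseteq$ is a well-order on $\beta_0$, $R$ is a well-order on $A$ [Exercise 19(b)]. ∎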


Exercises 

1. Prove Lemma 6.1.5. 

2. Does seg(\ A, fp) = ear seg(a, f) for all sets of ordinals A with 6 € A? Explain.

3. Explain why $\{0, 2, 3, 4, 5\}$ is not an ordinal.

4. Prove that $\varnothing$ is an element of every nonempty ordinal.




5. Prove that an ordinal is a transitive set (Theorem 6.1.8). 


6. Let $\mathcal{A}$ be a set of ordinals. Prove.
(a) $\bigcap \mathcal{A}$ is an ordinal.
(b) $\mathcal{A}$ has a least element with respect to $\subseteq$.


7. Let B be a nonempty subset of the ordinal a. Prove that there exists 6 € B such 
that 6 and B are disjoint. 


8. Prove Theorem 6.1.18. 
9. From the proof of Theorem 6.1.19, prove that $\operatorname{seg}_{\preccurlyeq}(A, a)$ has a greatest element.
10. Prove that A is order isomorphic to B in the proof of Theorem 6.1.19. 


11. The proof of Theorem 6.1.19 contains many isomorphisms without explicitly iden- 
tifying the isomorphism. Find these functions and prove that they are order isomor- 
phisms. 


12. Let $R$ be a well-ordering on $A$ and suppose that $A$ has no greatest element. Show that the order type of $(A, R)$ is a limit ordinal.


13. Find a transitive set that is not an ordinal. 


14. Theorem 6.1.21 comes from the Burali-Forti paradox. Like Russell's paradox (page 225), it arises when any formula is allowed to define a set. In this case, suppose that $\mathcal{A} = \{\alpha : \alpha \text{ is an ordinal}\}$ and assume that $\mathcal{A}$ is a set. Prove that $\mathcal{A}$ is an ordinal that must include all ordinals as its elements.


15. Prove that there exists a function F such that F(n) is the nth Fibonacci number. 


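16. Prove that for every function $h : {}^{<\omega}A \to A$, there is a unique function $f : \omega \to A$ such that for all $n \in \omega$, $f(n) = h(f \upharpoonright n)$.

17. Let $(A, \preccurlyeq)$ and $(B, \preccurlyeq')$ be well-ordered sets and $\varphi : A \to B$ be an order-preserving surjection. Prove that for every $a \in A$, there exists $b \in B$ such that

$$\varphi[\operatorname{seg}_{\preccurlyeq}(A, a)] = \operatorname{seg}_{\preccurlyeq'}(B, b).$$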


18. Prove Corollary 6.1.24. 


19. Prove the following from the proof of Zermelo's theorem (6.1.25).
(a) $\Phi \upharpoonright \beta_0$ is a bijection.
(b) $R$ is a well-order on $A$.


20. Prove that Theorem 6.1.25 implies Axiom 5.1.10. 


21. Prove that Zorn’s lemma (5.1.13) implies Theorem 6.1.25 without using the axiom 
of choice. 




6.2 EQUINUMEROSITY 


How can we determine whether two sets are of the same size? One possibility is to 
count their elements. What happens, however, if the sets are infinite? We need another 
method. Suppose A = {12,47,84} and B = {17,101,200}. We can see that these 
two sets are the same size without counting. Define a function $f : A \to B$ so that
f(12) = 17, f(47) = 101, and f(84) = 200. This function is a bijection. Since each 
element is paired with exactly one element of the opposite set, A and B must be the 
same size. This is the motivation behind our first definition. 


DEFINITION 6.2.1

The sets $A$ and $B$ are equinumerous (written as $A \approx B$) if there exists a bijection $\varphi : A \to B$. If $A$ and $B$ are not equinumerous, write $A \not\approx B$.


EXAMPLE 6.2.2
Take $n \in \mathbb{Z}$ such that $n \ne 0$ and define

$$n\mathbb{Z} = \{nk : k \in \mathbb{Z}\}.$$

We prove that $\mathbb{Z} \approx n\mathbb{Z}$. To show this, we must find a bijection $f : \mathbb{Z} \to n\mathbb{Z}$. Define $f(k) = nk$.

• Assume $x_1, x_2 \in \mathbb{Z}$, and let $f(x_1) = f(x_2)$. Then $nx_1 = nx_2$, which yields $x_1 = x_2$ since $n \ne 0$. Thus, $f$ is one-to-one.

• Let $y \in n\mathbb{Z}$. This means that $y = nk$ for some $k \in \mathbb{Z}$, so $y = f(k)$. This shows that $f$ is onto and, hence, a bijection.
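The bijection $f(k) = nk$ and its inverse are easy to exercise on a finite window of integers. The Python sketch below fixes $n = -3$ as an arbitrary nonzero choice; the name `f_inv` and the window bounds are assumptions made for illustration.

```python
def f(k, n):
    """The map Z -> nZ given by k -> nk."""
    return n * k

def f_inv(y, n):
    """Inverse map nZ -> Z; y must be a multiple of the nonzero integer n."""
    if n == 0 or y % n != 0:
        raise ValueError("y must be a multiple of a nonzero n")
    return y // n

n = -3
window = range(-10, 11)
images = [f(k, n) for k in window]

one_to_one = len(set(images)) == len(images)
round_trip = all(f_inv(f(k, n), n) == k for k in window)
```

Having an explicit inverse on all of $n\mathbb{Z}$ is another way to see that $f$ is both one-to-one and onto.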


EXAMPLE 6.2.3
To see Z⁺ ≈ Z, define a one-to-one correspondence so that each even integer is
paired with a nonnegative integer and every odd integer is paired with a negative
integer (Figure 6.1). Let g : Z⁺ → Z be defined by

g(n) = k − 1   if n = 2k for some k ∈ Z⁺,
g(n) = −k      if n = 2k − 1 for some k ∈ Z⁺.


Notice that g(4) = 1 since 4 = 2(2), and g(5) = −3 because 5 = 2(3) − 1. This
function is a bijection (Exercise 12).
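The pairing of Example 6.2.3 can be spot-checked on a finite window. The following Python sketch is only an illustration, not a proof: the window 1, …, 20 is an arbitrary choice, and the check confirms that g hits each integer from −10 to 9 exactly once on that window.

```python
def g(n: int) -> int:
    """Map a positive integer to an integer as in Example 6.2.3."""
    if n <= 0:
        raise ValueError("g is defined on positive integers only")
    if n % 2 == 0:
        # n = 2k with k >= 1, so g(n) = k - 1
        return n // 2 - 1
    # n = 2k - 1, so k = (n + 1) / 2 and g(n) = -k
    return -((n + 1) // 2)


# Finite window (an arbitrary choice made for this illustration).
values = [g(n) for n in range(1, 21)]
print(values)
```

On this window the even inputs produce 0, 1, …, 9 and the odd inputs produce −1, −2, …, −10, which matches the description in the example.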


Equinumerosity plays a role similar to that of equality of integers. This is seen in the 
next theorem. In fact, the theorem resembles Definition 4.2.4. Despite this, it does not 
demonstrate the existence of an equivalence relation. This is because an equivalence 
relation is a relation on a set, so, to define an equivalence relation, the next result would 
require a set of all sets, contradicting Corollary 5.1.17. 


Section 6.2 EQUINUMEROSITY 299 


Figure 6.1 Z ≈ Z⁺.


THEOREM 6.2.4
Let A, B, and C be sets.


• A ≈ A. (Reflexive)

• If A ≈ B, then B ≈ A. (Symmetric)

• If A ≈ B and B ≈ C, then A ≈ C. (Transitive)

PROOF


• A ≈ A since the identity map is a bijection.


• Assume A ≈ B. Then, there exists a bijection φ : A → B. Therefore, φ⁻¹ is a
bijection. Hence, B ≈ A.


• By Theorem 4.5.23, the composition of two bijections is a bijection. Therefore,
A ≈ B and B ≈ C implies A ≈ C. ∎


The symmetric property allows us to conclude that nZ ≈ Z and Z ≈ Z⁺ from
Examples 6.2.2 and 6.2.3. The transitivity part of Theorem 6.2.4 allows us to conclude
from this that nZ ≈ Z⁺.


EXAMPLE 6.2.5


Show (0, 1) ≈ R. We do this in two parts. First, let f : (0, 1) → (−π/2, π/2) be
defined by f(x) = πx − π/2. This function is a one-to-one correspondence since
its graph is a nonvertical, nonhorizontal line (Exercise 4.5.4). Second, define
g : (−π/2, π/2) → R to be the function g(x) = tan x. From trigonometry, we
know that tangent is a bijection on (−π/2, π/2). Hence,


(0, 1) ≈ (−π/2, π/2)


and
(−π/2, π/2) ≈ R.


Therefore, (0, 1) ≈ R by Theorem 6.2.4.
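The two maps of Example 6.2.5 compose to a single bijection (0, 1) → R, and its inverse can be written down with the arctangent. The sketch below uses floating-point numbers, so it only approximates the real-valued maps; the names to_real and from_real are introduced here for illustration and do not come from the text.

```python
import math


def f(x: float) -> float:
    """Affine map (0, 1) -> (-pi/2, pi/2) from Example 6.2.5."""
    return math.pi * x - math.pi / 2


def to_real(x: float) -> float:
    """Composite g(f(x)) = tan(pi*x - pi/2), mapping (0, 1) onto R."""
    return math.tan(f(x))


def from_real(y: float) -> float:
    """Inverse of to_real: atan undoes tan, then the affine map is undone."""
    return (math.atan(y) + math.pi / 2) / math.pi


for y in (-1000.0, -1.0, 0.0, 2.5, 1000.0):
    x = from_real(y)
    print(y, x, to_real(x))
```

Each printed triple shows a real number, its preimage in (0, 1), and the value recovered by applying the composite again, up to rounding error.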




Order 
If ≈ resembles equality, the following resembles the ≤ relation.


DEFINITION 6.2.6


The set B dominates the set A (written as A ≼ B) if there exists an injection
φ : A → B. If B does not dominate A, write A ⋠ B. Furthermore, define A ≺ B
to mean A ≼ B but A ≉ B.


EXAMPLE 6.2.7


If A ⊆ B, then A ≼ B. This is proved using the inclusion map (Exercise 4.4.26).
For instance, let ι : Z⁺ → Z be the inclusion map and f : Z → R be defined as


f(n) = seg_≤(Q, n).


Then, Z⁺ ≼ R because f ∘ ι is an injection. Similarly, ω ≼ C. However, A ⊂ B
does not imply A ≉ B. As an example, nZ ≈ Z, but nZ ⊂ Z when n ≠ ±1.


Another method used to prove that A ≼ B is to find a surjection B → A. Consider
the sets A = {1, 2} and B = {3, 4, 5}. Define f : B → A to be the surjection given by
f(3) = 1, f(4) = 2, and f(5) = 2. This is the inverse of the relation R in Figure 4.1.
To show that B dominates A, we must find an injection A → B. To do this, modify f⁻¹
by deleting (2, 4) and call the resulting function g. Observe that g(1) = 3 and g(2) = 5,
which is an injection, so A ≼ B.


THEOREM 6.2.8


If there exists a surjection φ : A → B, then B ≼ A.


PROOF
Let φ : A → B be onto. Define a relation R ⊆ B × A by


R = {(b, a) : φ(a) = b}.


Since φ is onto,
dom(R) = ran(φ) = B.


Corollary 5.1.11 yields a function f so that dom(f) = dom(R) and f ⊆ R.
We claim that f is one-to-one. Indeed, let b₁, b₂ ∈ B. Assume that we have
f(b₁) = f(b₂). Let a₁ = f(b₁) and a₂ = f(b₂) where a₁, a₂ ∈ A. This means
a₁ = a₂. Also, φ(a₁) = b₁ and φ(a₂) = b₂ because f ⊆ R. Since φ is a function,
b₁ = b₂. ∎
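For finite sets the choice made in the proof of Theorem 6.2.8 can be carried out explicitly: for each b pick one a with φ(a) = b. The Python sketch below keeps the first preimage it meets, which is one arbitrary choice among many; the function name injection_from_surjection is hypothetical.

```python
def injection_from_surjection(phi, domain):
    """Given a surjection phi on a finite domain, return a dict f with
    phi(f[b]) == b for every b in the range, so that f is one-to-one."""
    chosen = {}
    for a in domain:
        # Keep only the first preimage found for each value of phi.
        chosen.setdefault(phi(a), a)
    return chosen


# The surjection f : B -> A from the paragraph before Theorem 6.2.8.
surjection = {3: 1, 4: 2, 5: 2}
g = injection_from_surjection(surjection.get, sorted(surjection))
print(g)
```

With this selection rule the result sends 2 to 4, whereas the text deletes (2, 4) and sends 2 to 5. Either selection gives an injection A → B.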


EXAMPLE 6.2.9


Let R be an equivalence relation on a set A. The map φ : A → A/R defined by
φ(a) = [a]_R is a surjection. Therefore, A/R ≼ A.


EXAMPLE 6.2.10


We know that Z⁺ ≼ R by Example 6.2.7. We can also prove this by using the
function f : R → Z⁺ defined by f(x) = |⟦x⟧| + 1 and appealing to Theorem 6.2.8.


The next theorem states that ≼ closely resembles an antisymmetric relation (Definition 4.3.1).
Cantor was the first to publish a statement of it (1888). He proved it using
the axiom of choice, but it was later shown that it can be proved in ZF. It was proved
independently by Ernst Schröder and Felix Bernstein around 1890. The proof given
here follows that of Julius König (1906).


THEOREM 6.2.11 [Cantor–Schröder–Bernstein]
If A ≼ B and B ≼ A, then A ≈ B.
PROOF
Let f : A → B and g : B → A be injections. To prove that A is equinumerous to
B, we define a one-to-one correspondence h : A → B. To do this, we recursively
define two sequences of sets by first letting


C₀ = A \ ran(g)


and

D₀ = f[C₀].

Then, for n ∈ ω,

C_{n+1} = g[D_n]

and

D_n = f[C_n].


This is illustrated in Figure 6.2. Note that both {C_n : n ∈ ω} and {D_n : n ∈ ω}
are pairwise disjoint because f and g are one-to-one (Exercise 11). Define h by


h = f ↾ (⋃_{n∈ω} C_n) ∪ g⁻¹ ↾ (ran(g) \ ⋃_{n∈ω} C_n).


We show that h is the desired function.


• Let x₁, x₂ ∈ A such that x₁ ≠ x₂. Since both f and g are one-to-one, we
only need to check the case when x₁ ∈ C_k for some k ∈ ω and x₂ ∉ C_n
for all n ∈ ω. Then, f(x₁) ∈ D_k but g⁻¹(x₂) ∉ D_k, so f(x₁) ≠ g⁻¹(x₂).
That is, h(x₁) ≠ h(x₂).


• Take y ∈ B. If y ∈ D_k for some k ∈ ω, then y = f(x) for some x ∈ C_k.
That is, y = h(x). Now suppose y ∉ ⋃_{n∈ω} D_n. Clearly, g(y) ∉ C₀ because
g(y) ∈ ran(g). If g(y) ∈ C_k for some k > 0, then y ∈ D_{k−1}, a contradiction. Hence,
g(y) = x for some x ∈ ran(g) \ ⋃_{n∈ω} C_n. This implies that we have
h(x) = g⁻¹(x) = y. ∎
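The construction in the proof can be traced on a concrete pair of injections. In the Python sketch below, A and B are both taken to be the natural numbers and f(n) = g(n) = n + 1; these choices are assumptions made only for illustration. Membership in some C_n is decided by following x backwards through g⁻¹ and f⁻¹ until the chain leaves ran(g) or ran(f).

```python
def f(a: int) -> int:
    """An injection A -> B (illustrative choice: n -> n + 1)."""
    return a + 1


def g(b: int) -> int:
    """An injection B -> A (illustrative choice: n -> n + 1)."""
    return b + 1


def in_some_C(x: int) -> bool:
    """Decide whether x lies in C_n for some n by tracing x backwards."""
    while True:
        if x == 0:
            # x is not in ran(g), so the chain starts in C_0.
            return True
        y = x - 1  # y = g^{-1}(x), an element of B
        if y == 0:
            # y is not in ran(f), so x lies in no C_n.
            return False
        x = y - 1  # f^{-1}(y), an element of A


def h(x: int) -> int:
    """The map h from the proof: f on the union of the C_n, g^{-1} elsewhere."""
    return f(x) if in_some_C(x) else x - 1


print([h(x) for x in range(10)])
```

For these particular injections the sets C_n are the singletons {2n}, so h swaps each even number with its successor.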


Figure 6.2 The Cantor–Schröder–Bernstein theorem. (The diagram shows A split into C₀ = A \ ran(g) and ran(g), with the regions ran(g) \ ⋃_{n∈ω} C_n and B \ ⋃_{n∈ω} D_n marked.)




EXAMPLE 6.2.12


Because (0, 1) ⊆ [0, 1], we have that (0, 1) ≼ [0, 1], and because the function
f : [0, 1] → (0, 1) defined by


f(x) = (1/2)x + 1/4


is an injection, we have that [0, 1] ≼ (0, 1), so we conclude by the
Cantor–Schröder–Bernstein theorem (6.2.11) that (0, 1) ≈ [0, 1].


Diagonalization


The strict inequality A ≺ B is sometimes difficult to prove because we must show that
there does not exist a bijection from A onto B. The next method was developed by
Cantor (1891) to accomplish this for infinite sets. It is called diagonalization.

Let M be the set of all functions f : ω → {m, w}, where m ≠ w. To show that
ω ≺ M, we prove two facts:


• There exists an injection ω → M.
• There is no one-to-one correspondence between ω and M.


Cantor’s method does both of these at once. Let φ : ω → M be a function. Writing
the functions of the range of φ as infinite tuples, let


f_i = (a_{i0}, a_{i1}, a_{i2}, …, a_{ij}, …),

where a_{ij} ∈ {m, w} for all i, j ∈ ω. For example,

φ(4)(3) = f₄(3) = a₄₃.

Now, write the functions in order:

f₀ = (a₀₀, a₀₁, a₀₂, …, a_{0j}, …),
f₁ = (a₁₀, a₁₁, a₁₂, …, a_{1j}, …),
f₂ = (a₂₀, a₂₁, a₂₂, …, a_{2j}, …),
⋮
f_i = (a_{i0}, a_{i1}, a_{i2}, …, a_{ij}, …),
⋮


From this, a function f ∈ M that is not in the list can be found by identifying the
elements on the diagonal and defining f(n) to be the opposite of a_{nn}. In other words,
define for all i ∈ ω,

b_i = m   if a_{ii} = w,
b_i = w   if a_{ii} = m,

and f(n) = b_n is an element of M not in the list because for all n ∈ ω,


f(n) ≠ a_{nn}.


Since the function φ mapping ω to M is arbitrary, there are injections ω → M but
none of them are onto. Therefore,


ω ≺ M.
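The diagonal construction can be imitated on a finite square of entries. The Python sketch below truncates each listed function to its first few values, so it only illustrates the mechanism; the sample rows are invented for this purpose.

```python
def diagonal(rows):
    """Return the sequence that differs from row i in position i."""
    flip = {"m": "w", "w": "m"}
    return [flip[rows[i][i]] for i in range(len(rows))]


# A made-up finite square standing in for the first values of f_0, f_1, ...
rows = [
    list("mwmw"),
    list("wwww"),
    list("mmwm"),
    list("wmmw"),
]
new_row = diagonal(rows)
print(new_row)
```

Because new_row disagrees with row i in position i, it cannot equal any row of the square, which is the finite shadow of f(n) ≠ a_{nn}.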


Furthermore, note that the elements of [0, 1] can be uniquely represented as binary
numbers of the form


0.a₀a₁a₂ … a_i …,


where a_i ∈ {0, 1} for each i ∈ ω. For example

1 = 0.1111111…


and
1/2 = 0.1000000… .


Therefore,
M ≈ [0, 1].


Hence, we can conclude like Cantor that since [0, 1] ≈ R (Examples 6.2.5 and 6.2.12),
ω ≺ R.


Cantor’s diagonalization argument can be generalized, but we first need a definition.
Let A be a set and B ⊆ A. The function


χ_B : A → {0, 1}


is called a characteristic function and is defined by


χ_B(a) = 1   if a ∈ B,
χ_B(a) = 0   if a ∉ B.


For example, if A = Z and B = {0, 1, 3, 5}, then χ_B(1) = 1 but χ_B(2) = 0. Moreover,
for every set A,


ᴬ2 = {χ_B : B ⊆ A}.

The characteristic function plays an important role in the proof of the next theorem.
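A characteristic function on a finite set can be stored as a table of 0s and 1s. In the Python sketch below, a finite window of Z replaces Z itself, which is an assumption made so the table is finite; the name chi is hypothetical.

```python
def chi(B, A):
    """Characteristic function of B as a dict sending each a in A to 0 or 1."""
    if not set(B) <= set(A):
        raise ValueError("B must be a subset of A")
    return {a: 1 if a in B else 0 for a in A}


A = range(-5, 10)   # finite stand-in for Z
B = {0, 1, 3, 5}
chi_B = chi(B, A)
print(chi_B[1], chi_B[2])

# The subset can be recovered from its characteristic function.
recovered = {a for a, bit in chi_B.items() if bit == 1}
```

Recovering B from χ_B in this way is the idea behind the identification of subsets of A with elements of ᴬ2.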


THEOREM 6.2.13


If A is a set, A ≺ ᴬ2.


PROOF
Since the function ψ : A → ᴬ2 defined by


ψ(a) = χ_{a}


is an injection, ᴬ2 dominates A. To show that A is not equinumerous with ᴬ2,
we show that there is no surjection A → ᴬ2. Let φ : A → ᴬ2 be a function, and
for all a ∈ A, write φ(a) = χ_{B_a} for some B_a ⊆ A. Define χ so that


χ(a) = 1   if χ_{B_a}(a) = 0,
χ(a) = 0   if χ_{B_a}(a) = 1.


Therefore, χ ∉ ran(φ) because χ_{B_a}(a) ≠ χ(a) for all a ∈ A. However, χ ∈ ᴬ2.
To prove this, define
B = {a ∈ A : χ_{B_a}(a) = 0}.


We conclude that χ(a) = χ_B(a) because if χ_{B_a}(a) = 0, then a ∈ B, so χ(a) = 1
and χ_B(a) = 1, but if χ_{B_a}(a) = 1, then a ∉ B, so χ(a) = 0 and χ_B(a) = 0.
Therefore, χ = χ_B, and φ is not onto. ∎
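For a small finite set the conclusion of Theorem 6.2.13 can be checked exhaustively. The Python sketch below works with subsets instead of characteristic functions (the two are interchangeable by the remark before the theorem), takes A = {0, 1, 2} as an arbitrary test case, and confirms that the diagonal set B = {a ∈ A : a ∉ B_a} is missed by every one of the 512 functions from A to its subsets.

```python
from itertools import combinations, product


def powerset(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(phi, A):
    """B = {a in A : a not in phi(a)}, the set used in the proof."""
    return frozenset(a for a in A if a not in phi[a])


A = {0, 1, 2}
subsets = powerset(A)

# Try every function phi : A -> P(A) and record whether the diagonal
# set ever appears among its values.
missed_every_time = all(
    diagonal_set(dict(zip(sorted(A), images)), A) not in images
    for images in product(subsets, repeat=len(A))
)
print(missed_every_time)
```

The exhaustive search is feasible only because A is tiny; the proof above is what covers arbitrary sets.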


By Exercise 14,
P(A) ≈ ᴬ2.


This result combined with Theorem 6.2.13 quickly yields the following.


COROLLARY 6.2.14
If A is a set, A ≺ P(A).


From the theorem, we conclude that there exists a sequence of sets
ω ≺ P(ω) ≺ P(P(ω)) ≺ P(P(P(ω))) ≺ ⋯.
Thus, there are larger and larger magnitudes of infinity.


Exercises 


1. Given χ_B : Z → {0, 1} with B = {2, 5, 19, 23}, find
(a) χ_B()
(b) χ_B(2)
(c) χ_B(−10)
(d) χ_B(19)
2. Find a surjection φ : ω × ω → ω showing that ω ≼ ω × ω.


3. Prove the given equations.
(a) χ_{A∪B} = χ_A + χ_B − χ_A χ_B
(b) χ_{A∩B} = χ_A χ_B


4. Prove that there exists a bijection between the given pairs of sets. 


(a) [0, π] and [−1, 1]

(b) [−π/2, π/2] and [−1, 1]

(c) (0, ∞) and R

(d) ω and Z

(e) Z⁺ and Z⁻

(f) {(x, 0) : x ∈ R} and R

(g) Z and Z × Z

(h) {(x, y) ∈ R × R : y = 2x + 4} and R


5. Let a < b and c < d, where a, b, c, d ∈ R. Prove.
(a) (a, b) ≈ (c, d)
(b) [a, b] ≈ [c, d]
(c) (a, b) ≈ [c, d]
(d) (a, b) ≈ (c, d]


6. Prove R ≈ C.


7. Let A, B, C, and D be nonempty sets. Prove.
(a) N ≼ Z
(b) A ∩ B ≼ P(A)
(c) A ≼ A × B
(d) [0, 2] ≼ [5, 7]
(e) A ∩ B ≼ A
(f) 4B<AxB 
(g) A × {0} ≼ (A ∪ B) × {1}
(h) A × B × C ≼ A × B × C × D
(i) A \ B ≼ C × A


8. Prove.
(a) If A ≼ B and B ≈ C, then A ≼ C.
(b) If A ≈ B and B ≼ C, then A ≼ C.
(c) If A ≼ B and B ≼ C, then A ≼ C.
(d) If A ≼ B and C ≼ D, then ᴬC ≼ ᴮD.
(e) If A ≈ B, then P(A) ≈ P(B).
(f) If A ≈ B and C ≈ D, then A × C ≈ B × D.
(g) If A ≈ B, a ∈ A, and b ∈ B, then A \ {a} ≈ B \ {b}.
(h) If A \ B ≈ B \ A, then A ≈ B.
(i) If A ⊆ B and A ≈ A ∪ C, then B ≈ A ∪ B ∪ C.
(j) If C ⊆ A, B ⊆ D, and A ∪ B ≈ B, then C ∪ D ≈ D.


9. Given the function g : A > B, prove that g = A and @ < ran(@). 


10. Without appealing to the Cantor–Schröder–Bernstein theorem (6.2.11), prove that
(0, 1) ≈ [0, 1].


Section 6.3 CARDINAL NUMBERS 307 


11. Prove that the sets {C_n : n ∈ ω} and {D_n : n ∈ ω} from the proof of
Theorem 6.2.11 are pairwise disjoint.


12. Show that g is a bijection, where g : Z⁺ → Z is defined by


g(n) = k − 1   if n = 2k for some k ∈ Z⁺,
g(n) = −k      if n = 2k − 1 for some k ∈ Z⁺.


13. Let A be an infinite set and {a₁, a₂, …} be a set of distinct elements from A. Prove
that φ : A → A \ {a₁} is a bijection, where


φ(x) = a_{n+1}   if x = a_n,
φ(x) = x         otherwise.


14. For any set A, prove that P(A) ≈ ᴬ2.


15. Prove that if f : A → B is a surjection, there exists a function g : B → A such
that f ∘ g = I_B.


16. Use the power set to prove that there is no set of all sets. 


6.3 CARDINAL NUMBERS 
Let A be a set. Define
B = {β : β is an ordinal ∧ β ≈ A}.


By Zermelo’s theorem (6.1.25), A can be well-ordered, so by Theorem 6.1.19, A is
order isomorphic to some ordinal, so B is nonempty. Moreover, B is a subset of an
ordinal (Theorem 6.1.20). Therefore, (B, ⊆) has a least element, which has the property
that it is not equinumerous to any of its elements. This allows us to define the second
of our new types of number (page 283). This type will be used to denote the size of a
set.


DEFINITION 6.3.1


An ordinal κ is a cardinal number (or simply a cardinal) if α ≉ κ for every
α ∈ κ.


Observe that every infinite cardinal is a limit ordinal. This is because α ≈ α⁺ for every
infinite ordinal α. However, a limit ordinal might not be a cardinal.

Let κ and λ be cardinals. Suppose that A ≈ κ and A ≈ λ. By Theorem 6.2.4,
we have that κ ≈ λ. If κ ∈ λ or λ ∈ κ, this would contradict the definition of a
cardinal number. Therefore, κ = λ (Corollary 6.1.15), and we conclude that every set
is equinumerous to exactly one cardinal.




DEFINITION 6.3.2


The cardinality of a set A is denoted by |A| and defined as the unique cardinal
equinumerous to A.


Observe that the cardinality of a cardinal κ is κ.


EXAMPLE 6.3.3


Let ℱ be a set of cardinals. By Theorem 6.1.16, we know that ⋃ℱ is an ordinal.
Now we show that it is also a cardinal. Suppose that α is an ordinal such that
α ≈ ⋃ℱ. By Definition 6.3.1, we must show that ⋃ℱ ⊆ α. Suppose there
exists an ordinal β ∈ ⋃ℱ such that β ∉ α. This means that there exists a
cardinal κ ∈ ℱ such that β ∈ κ. This is impossible because


α ≼ β ≺ κ ≼ ⋃ℱ ≈ α.


Finite Sets 


Intuitively, we know what a finite set is. Both of the sets A = {0,2,3,5,8, 10} and 
B = {n ∈ Z : (n − 1)(n + 3) = 0} are examples because we can count all of their
elements and find that there is only one ordinal equinumerous to A and only one
ordinal equinumerous to B. That is, |A| = 6 and |B| = 2. This suggests the following
definition.


DEFINITION 6.3.4


For every set A, if there exists n ∈ ω such that A ≈ n, then A is finite. If A is
not finite, it is infinite.


As we will see, finite sets are fundamentally different from infinite sets. There are 
properties that finite sets have in addition to the number of their elements that infinite 
sets do not have. Let us consider some of those properties of finite sets. 


LEMMA 6.3.5
Let n be a positive natural number. If y ∈ n, then n \ {y} ≈ n⁻.


PROOF
We proceed by mathematical induction.


• When n = 1, it must be the case that y = 0, so n \ {y} = 0 = 1⁻.


• Take y ∈ n + 1. If y = n, then (n + 1) \ {n} = n, so suppose that y < n.
By induction, there exists a bijection g : n \ {y} → n⁻. Then, define
f : (n + 1) \ {y} → n by f(m) = g(m) for all m < n and f(n) = n⁻. The
function f is a bijection (Exercise 3). ∎


Lemma 6.3.5 is used to prove a characteristic property of finite cardinals. 


THEOREM 6.3.6


No natural number is equinumerous to a proper subset of itself.


PROOF
Let n ∈ ω be minimal such that there exists A ⊂ n and n ≈ A. Since ∅ has no
proper subsets and the only proper subset of 1 is 0, we can assume that n ≥ 2.
Let f : A → n be a bijection and x ∈ A and y ∈ n \ A such that f(x) = y. We
check the following.


• Let a ∈ A \ {x}. If f(a) = y, then we contradict the hypothesis that f is
one-to-one because f(x) = y and x ≠ a. Thus, f(a) ∈ n \ {y}.


• Take b ∈ n \ {y}. Since f is a surjection, there exists a ∈ A such that
f(a) = b. If a = x, then {b, y} ⊆ [x]_f (Definition 4.2.7), which is impossible
because f is a function. Thus, a ∈ A \ {x}, and we conclude that
f ↾ (A \ {x}) is onto n \ {y}.


• Since the restriction of a one-to-one function is one-to-one, f ↾ (A \ {x})
is one-to-one.


Hence, f₀ = f ↾ (A \ {x}) is a bijection with range n \ {y}. We have two cases
to consider.


• Suppose n⁻ ∉ A. This implies that A \ {x} ⊆ n⁻. Since x ∈ A, we have
that x ≠ n⁻, so x ∈ n⁻. Thus, A \ {x} ⊂ n⁻. By Lemma 6.3.5, there exists
a bijection g : n \ {y} → n⁻, so we have A \ {x} ≈ n⁻ because g ∘ f₀ is a
bijection, contradicting the minimality of n.


• Assume n⁻ ∈ A. Define A′ = (A \ {n⁻}) ∪ {y}. Since y ∉ A, A′ ≈ A,
which implies that A′ ≈ n. Replace A with A′ in the previous argument
and use f ↾ (A′ \ {y}) to contradict the minimality of n. ∎


COROLLARY 6.3.7


Every finite set is equinumerous to exactly one natural number.


PROOF
Let A be finite. This means that A ≈ n for some n ∈ ω. Let m ∈ ω also have
the property that A ≈ m. This implies that n ≈ m. Hence, by Theorem 6.3.6, we
conclude that n = m because n ⊆ m or m ⊆ n. ∎


There are many results that follow directly from Theorem 6.3.6. The following six 
corollaries are among them. 


COROLLARY 6.3.8 [Pigeonhole Principle]


Let A and B be finite sets with B ≺ A. There is no one-to-one function A → B.


PROOF
There exist unique m, n ∈ ω such that A ≈ m and B ≈ n by Corollary 6.3.7.
Assume that f : A → B is one-to-one. Then,


m ≈ A ≼ B ≈ n,


so m is equinumerous to a subset of n. This implies that m ∈ n. However, n ⊂ m
because B ≺ A, which contradicts Theorem 6.1.14. ∎
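For small finite sets the pigeonhole principle can be confirmed by listing every function and counting the one-to-one ones. The Python sketch below is a brute-force illustration; the sizes 4 and 3 are arbitrary choices and the function name injections is hypothetical.

```python
from itertools import product


def injections(A, B):
    """Yield every one-to-one function from A to B as a dict."""
    A, B = list(A), list(B)
    for images in product(B, repeat=len(A)):
        if len(set(images)) == len(A):
            yield dict(zip(A, images))


# No injection from a 4-element set into a 3-element set ...
count_4_to_3 = sum(1 for _ in injections(range(4), range(3)))
# ... but plenty in the other direction.
count_3_to_4 = sum(1 for _ in injections(range(3), range(4)))
print(count_4_to_3, count_3_to_4)
```

The first count is 0, as the corollary predicts, and the second is 4 · 3 · 2 = 24.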


COROLLARY 6.3.9


No finite set is equinumerous to a proper subset of itself.


COROLLARY 6.3.10


A set equinumerous to a proper subset of itself is infinite.


Because f : ω → ω \ {0} defined by f(n) = n + 1 is a bijection, we have the next
result by Corollary 6.3.10.


COROLLARY 6.3.11
ω is infinite.

The proofs of the last two corollaries are left to Exercise 5.


COROLLARY 6.3.12


If A is a proper subset of a natural number n, there exists m < n such that A ≈ m.


COROLLARY 6.3.13


Let A ⊆ B. If B is finite, A is finite, and if A is infinite, B is infinite.


Countable Sets 
Since ω is the first infinite ordinal, it is also a cardinal. Therefore,
|A| = ω if and only if A ≈ ω.


As sets go, finite sets and those equinumerous with ω are small, so we classify them
together using the next definition.


DEFINITION 6.3.14


A set A is countable if A ≼ ω.


Sometimes countable sets are called discrete or denumerable. For example, the
bijection f : Z⁺ → ω defined by f(n) = n⁻ shows that Z⁺ is countable. Moreover, a
nonempty finite set is countable and can be written as


{a₀, a₁, …, a_{n−1}},


for some positive integer n. A countably infinite set can be written as


{a₀, a₁, a₂, …},


where there are infinitely many distinct elements of the set.


EXAMPLE 6.3.15


The set of rational numbers is a countable set. To prove this, define a bijection
f : ω → Q by first mapping the even natural numbers to the nonnegative rational
numbers. The function is defined along the path indicated in Figure 6.3. When
a rational number that previously has been used is encountered, it is skipped.
To complete the definition, associate the odd naturals with the negative rational
numbers using a path as in the diagram. This function is a bijection, so we
conclude that Q is countable.
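An enumeration in the spirit of Example 6.3.15 can be generated lazily. The Python sketch below is not the path of Figure 6.3: it walks the fractions p/q along the anti-diagonals p + q = 1, 2, 3, …, which is an assumption made for illustration, and it skips any value already produced. Even positions receive nonnegative rationals and odd positions receive negative ones.

```python
from fractions import Fraction
from itertools import count, islice


def nonnegative_rationals():
    """Yield each nonnegative rational exactly once, diagonal by diagonal."""
    seen = set()
    for s in count(1):              # s = p + q with p >= 0 and q >= 1
        for p in range(s):
            r = Fraction(p, s - p)
            if r not in seen:       # skip values already used
                seen.add(r)
                yield r


def rationals():
    """Yield f(0), f(1), f(2), ...: even indices get nonnegative rationals,
    odd indices get negative rationals."""
    nonneg = nonnegative_rationals()
    positives = (r for r in nonnegative_rationals() if r > 0)
    for a, b in zip(nonneg, positives):
        yield a
        yield -b


first = list(islice(rationals(), 8))
print(first)
```

The first eight values are 0, −1, 1, −1/2, 1/2, −2, 2, −1/3. No value repeats, because each generator records what it has already produced.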


We have defined countability in terms of bijections. Now let us identify a condition 
for countability using surjections. 


THEOREM 6.3.16
A set A is countable if and only if there exists a function from ω onto A.


PROOF
If φ : ω → A is a surjection, by Theorem 6.2.8, A ≼ ω. Conversely, suppose A
is countable. We have two cases to check.


• Suppose A ≈ ω. Then, there is a surjection from the set of natural numbers
to A.


• Now let A ≈ n for some n ∈ ω. If A = ∅, then ∅ is a surjection ω → A.
Thus, assume A ≠ ∅. This means that we can write


A = {a₀, a₁, …, a_{n−1}}.

Define φ : ω → A by


φ(i) = a_i       if i = 0, 1, …, n − 1,
φ(i) = a_{n−1}   otherwise.


This function is certainly onto. ∎


Figure 6.3 The rational numbers are countable.


If A_i is countable for all i = 0, 1, …, n − 1, then A₀ × A₁ × ⋯ × A_{n−1} is countable
(Exercise 15). In particular, ω × ω and Z × Z are countable. We also have the next
theorem.


THEOREM 6.3.17


The union of a countable family of countable sets is countable.


PROOF
Let {A_α : α ∈ I} be a family of countable sets with I countable. Since we have
that ⋃∅ = ∅ is countable (Example 3.4.12), we can assume that I is nonempty.
For each α ∈ I, there exists a surjection in ^ω(A_α) by Theorem 6.3.16. Therefore,
by Corollary 5.1.11, there exists


f : I → ⋃_{α∈I} ^ω(A_α)


such that f(α) is a surjection ω → A_α for all α ∈ I. Because I is countable,
we have a surjection g : ω → I, and since ω × ω is countable, we have another
surjection h : ω → ω × ω. We now define


ψ : ω × ω → ⋃_{α∈I} A_α


by ψ(m, n) = f(g(m))(n) and let

φ : ω → ⋃_{α∈I} A_α


be defined by φ = ψ ∘ h. To check that φ is onto, let a ∈ A_α, some α ∈ I. Since
g is onto, there exists i ∈ ω so that g(i) = α. Furthermore, since f(α) is onto,
we have j ∈ ω such that f(α)(j) = a, and since h is onto, there exists k ∈ ω so
that h(k) = (i, j). Therefore,


φ(k) = (ψ ∘ h)(k) = ψ(i, j) = f(g(i))(j) = f(α)(j) = a. ∎
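The composite φ = ψ ∘ h from the proof can be written out for a concrete family. In the Python sketch below, the family is {mZ : m ∈ ω} with index set I = ω, the map g is the identity, and h is the inverse of the Cantor pairing function; all of these are choices made for illustration and are not fixed by the theorem.

```python
import math


def unpair(k: int):
    """A surjection (in fact a bijection) from the naturals onto pairs of
    naturals: the inverse of the Cantor pairing function."""
    w = (math.isqrt(8 * k + 1) - 1) // 2
    j = k - w * (w + 1) // 2
    return w - j, j


def zigzag(n: int) -> int:
    """Enumerate Z as 0, 1, -1, 2, -2, ..."""
    return (n + 1) // 2 if n % 2 else -(n // 2)


def g(i: int) -> int:
    """Surjection from the naturals onto the index set I (here I is the naturals)."""
    return i


def f(alpha: int):
    """For each index alpha, a surjection from the naturals onto alpha*Z."""
    return lambda n: alpha * zigzag(n)


def psi(m: int, n: int) -> int:
    return f(g(m))(n)


def phi(k: int) -> int:
    """phi = psi o h with h = unpair."""
    return psi(*unpair(k))


values = {phi(k) for k in range(600)}
print(sorted(values)[:10])
```

Every element of the union is reached at some index k, although the same integer is typically reached many times, which is why φ is a surjection rather than a bijection.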


For example, ⋃{nZ : n ∈ ω} and ⋃{Q × {n} : n ∈ ω} are countable.


Alephs


Cantor denoted the first infinite cardinal ω by ℵ₀. The symbol ℵ (aleph) is the first
letter of the Hebrew alphabet. The next magnitude of infinity is ℵ₁, which seems to
exist by Theorem 6.2.13. This continues and gives an increasing sequence of infinite
cardinals, and since natural numbers must be less than any infinite cardinal, we have


0 ≺ 1 ≺ 2 ≺ ⋯ ≺ ℵ₀ ≺ ℵ₁ ≺ ℵ₂ ≺ ⋯.
For instance, 4 < &1, No X No, and &3 < Ny. 


EXAMPLE 6.3.18


Although he was unable to prove it, Cantor suspected that ℵ₁ = |R|. This
conjecture is called the continuum hypothesis (CH). However, it is possible that
ℵ₁ ≺ |R|. It is also possible that ℵ₁ = |R|. Cantor was unable to prove CH
because it is undecidable assuming the axioms of ZFC. In other words, it is an
ST-sentence that can be neither proved nor disproved from ZFC.


Now to define the other alephs. Pick an ordinal α. Define the function h by
h(g) = least infinite cardinal not in ran(g),
where g is a function with dom(g) ∈ α. For example, if α = 5 and
g = {(0, κ₀), (1, κ₁), (2, κ₂), (3, κ₃), (4, κ₄)},


then h(g) is the least infinite cardinal not in {κ₀, κ₁, κ₂, κ₃, κ₄}. By Transfinite
Recursion (6.1.23), there exists a unique function f with domain α and


f(β) = least infinite cardinal not in ran(f ↾ β)


for all β ∈ α. Define
ℵ_β = f(β).


Since h(∅) = ω, we have that ℵ₀ = ω. Moreover, the definition of f implies that

ℵ₁ = least infinite cardinal not in {ℵ₀},
ℵ₂ = least infinite cardinal not in {ℵ₀, ℵ₁},
ℵ₃ = least infinite cardinal not in {ℵ₀, ℵ₁, ℵ₂},
⋮
ℵ_ω = least infinite cardinal not in {ℵ_n : n ∈ ω},
⋮
ℵ_α = least infinite cardinal not in {ℵ_β : β ∈ α}.


The question at this point is whether the alephs name all of the infinite cardinals. 
The next theorem answers the question. 


THEOREM 6.3.19


For every infinite cardinal κ, there exists an ordinal α such that κ = ℵ_α.


PROOF
Suppose κ ≠ ℵ_α for every ordinal α. We can assume that κ is the minimal
such cardinal. Thus, for all cardinals λ ∈ κ, there exists an ordinal β_λ such that
λ = ℵ_{β_λ}. Therefore, the least infinite cardinal not an element of


{ℵ_{β_λ} : λ ∈ κ ∧ λ is a cardinal}

is the next aleph, but this is κ. ∎

EXAMPLE 6.3.20

The generalized continuum hypothesis (GCH) states that for every ordinal α,

ℵ_{α⁺} = |P(ℵ_α)|.

When α = 0, GCH implies that

ℵ₁ = |P(ℵ₀)| = |R|,

which is CH (Example 6.3.18). Like CH, GCH is undecidable in ZFC.

Like the ordinals, the cardinals can be divided into two classes.


DEFINITION 6.3.21


Let κ be a nonzero cardinal number. If κ ∈ ω or there exists an ordinal α such
that κ = ℵ_{α⁺}, then κ is a successor cardinal. Otherwise, κ is a limit cardinal.


For example, the positive natural numbers and ℵ₁ are successor cardinals, while ℵ₀
and ℵ_ω are limit cardinals. Notice that if κ is a limit cardinal,


κ = ⋃{λ : λ ∈ κ ∧ λ is a cardinal}.


Exercises 
1. Prove that |α| = |α⁺| for every ordinal α.


2. Let p(x) be a formula. Prove that if p(α) is false for some ordinal α, then there exists
a least ordinal β such that p(β) is false.


3. Prove that the function f in the proof of Lemma 6.3.5 is a bijection.


4. Show that the following attempted generalization of Lemma 6.3.5 is false: Let α be
an ordinal and a ∈ A. If |A| = ℵ_{α⁺}, then |A \ {a}| = ℵ_α.


5. Demonstrate Corollaries 6.3.12 and 6.3.13. 


6. Let A and B be finite sets. Prove the following. 


(a) |A ∪ B| = |A| + |B| − |A ∩ B|
(b) |A ∩ B| = |A| − |A \ B|
(c) |A × B| = |A| · |B|
7. Prove that the intersection and union of finite sets are finite.
8. Prove that every finite set has a choice function without using the axiom of choice.
9. Let R and R⁻¹ be well-orderings of a set A. Prove that A is finite.
10. Show that Z is countable.
11. Let A be infinite. Find infinite sets B and C such that A = B ∪ C.
12. If B is countable, prove that |A × B| = |A|.
13. Let A and B be sets such that A is countable. Prove that B is countable when A ≈ B.


14. Let A and B be countable sets. Show that the given sets are countable.
(a) A ∪ B
(b) A ∩ B
(c) A × B
(d) A \ B


15. Let A₀, A₁, …, A_{n−1} be countable sets. Prove that the given sets are countable.
(a) A₀ ∪ A₁ ∪ ⋯ ∪ A_{n−1}
(b) A₀ ∩ A₁ ∩ ⋯ ∩ A_{n−1}
(c) A₀ × A₁ × ⋯ × A_{n−1}

16. Prove. 


(a) If AU B is countable, A and B are countable. 
(b) If A is countable, 2a ~ 4@ 


17. Let ℱ be a set of cardinals. Prove that ⋃ℱ is a cardinal.


18. A real number is algebraic if it is a root of a nonzero polynomial with integer 
coefficients. A real number that is not algebraic is transcendental. Prove that the set 
of algebraic numbers is countable and the set of transcendental numbers is uncountable. 


19. Take a set A and define B = {α : α is an ordinal ∧ α ≼ A}. [See Hartogs’ theorem
(6.1.22).] Prove the following.


(a) B is a cardinal.

(b) |A| ≺ B.

(c) B is the least cardinal such that |A| ≺ B.
20. Assuming GCH, find |P(P(P(P(P(R)))))|. 


21. Prove that for all ordinals α and β, if α ∈ β, then ℵ_α ∈ ℵ_β.


22. Recursively define the following function using ℶ (beth), the second letter of the
Hebrew alphabet. Let α be an ordinal.


ℶ₀ = ℵ₀,
ℶ_{α⁺} = 2^{ℶ_α},
ℶ_α = ⋃_{β∈α} ℶ_β   if α is a limit.


(a) Use ℶ to restate GCH.
(b) Use transfinite recursion (Corollary 6.1.24) to prove that ℶ defines a function.


6.4 ARITHMETIC 


Since every natural number is both an ordinal and a cardinal, we want to extend the 
operations on @ to all of the ordinals and all of the cardinals. Since the purpose of the 
ordinals is to characterize well-ordered sets but the purpose of the cardinals is to count, 
we expect the two extensions to be different. 


Ordinals 


Definitions 5.2.15 and 5.2.18 define what it means to add and multiply finite ordinals.
When generalizing these two definitions to the infinite ordinals, we must take care
because addition and multiplication should be binary operations, but if we defined these
operations on all ordinals, their domains would not be sets by the Burali-Forti Theorem
(6.1.21), resulting in the operations not being sets. Therefore, we choose an ordinal and
define addition and multiplication on it. Since 1 + 1 ∉ 2, the ordinal must be a limit
ordinal.


DEFINITION 6.4.1


Let ζ be a limit ordinal. For all α, β ∈ ζ,
• α + 0 = α,
• α + β⁺ = (α + β)⁺,
• α + β = ⋃{α + δ : δ ∈ β} if β is a limit ordinal.


As with addition of natural numbers (Definition 5.2.15), to prove that Definition 6.4.1
gives a binary operation, let ψ : ζ → ζ be the successor function. By transfinite
recursion (Corollary 6.1.24), there exist unique functions φ_α : ζ → ζ for all α ∈ ζ
such that


• φ_α(0) = α,
• φ_α(β⁺) = ψ(φ_α(β)) = φ_α(β)⁺,
• for all limit ordinals β ∈ ζ,


φ_α(β) = ⋃_{δ∈β} φ_α(δ).


Define A : ζ × ζ → ζ by A(α, β) = φ_α(β). By the uniqueness of each φ_α, the binary
operation A is the function of Definition 6.4.1, so


α + β = A(α, β).


Furthermore, take ζ′ to be a limit ordinal such that ζ ⊆ ζ′ and define ψ′ : ζ′ → ζ′ to
be the successor function. As above, there are unique functions φ′_α for all α ∈ ζ′ with
the same properties as φ_α. Notice that

φ_α = φ′_α ↾ ζ.


Otherwise, φ′_α ↾ ζ would have the same properties as φ_α yet be a different function,
contradicting the uniqueness given by transfinite recursion. Next, define the binary
operation A′ : ζ′ × ζ′ → ζ′ by A′(α, β) = φ′_α(β). Therefore,


A = A′ ↾ (ζ × ζ).


This implies that although addition is not defined as a binary operation on all ordinals,
we can add any two ordinals and obtain the same sum independent of the ordinal on
which the addition is defined.

Consider m ∈ ω. Since ω is a limit ordinal,


m + ω = ⋃{m + n : n ∈ ω} = ω.


However,
ω + 1 = ω + 0⁺ = (ω + 0)⁺ = ω⁺ = ω ∪ {ω},
and


ω + 2 = ω + 1⁺ = (ω + 1)⁺ = (ω ∪ {ω})⁺ = ω ∪ {ω} ∪ {ω ∪ {ω}}.


Therefore, addition of infinite ordinals is not commutative. Moreover, an order
isomorphism can be defined between ω + n and ({0} × ω) ∪ ({1} × n) ordered lexicographically
(Exercise 4.3.16) as


0      1      2      …  n      …  ω      ω⁺     ω⁺⁺    …
↕      ↕      ↕         ↕         ↕      ↕      ↕
(0,0)  (0,1)  (0,2)  …  (0,n)  …  (1,0)  (1,1)  (1,2)  …

This means that ω + n looks like ω followed by a copy of n. Generalizing, the ordinal
ω + ω looks like ω followed by a copy of ω. In particular,

ω + ω = ⋃_{n∈ω} (ω + n),


which means that the proof of its existence requires a replacement axiom (5.1.9). All of
this suggests the next result.
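The failure of commutativity can be seen in a small computational model. The Python sketch below represents an ordinal of the form ω·a + b, with a and b natural numbers, as the pair (a, b); this encoding and the absorption rule in add are standard facts about such ordinals, but they are assumptions of the sketch and are not derived from Definition 6.4.1 here.

```python
from itertools import product
from typing import Tuple

# (a, b) stands for the ordinal ω·a + b, so only ordinals below ω·ω appear.
Ord = Tuple[int, int]

OMEGA: Ord = (1, 0)


def nat(n: int) -> Ord:
    """The finite ordinal n."""
    return (0, n)


def add(x: Ord, y: Ord) -> Ord:
    """Ordinal sum x + y: a finite tail of x is absorbed when y is infinite."""
    (a, b), (c, d) = x, y
    if c == 0:
        return (a, b + d)
    return (a + c, d)


print(add(nat(1), OMEGA))   # 1 + ω
print(add(OMEGA, nat(1)))   # ω + 1
print(add(OMEGA, OMEGA))    # ω + ω
```

In this model 1 + ω comes out as (1, 0), which is ω, while ω + 1 is (1, 1). A brute-force check over small pairs also agrees with the associativity asserted later in Theorem 6.4.4.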


LEMMA 6.4.2


Let ζ be a limit ordinal and α, β ∈ ζ. If x ∈ α + β, then x ∈ α or x ∈ α + b for
some b ∈ β.


PROOF
Define


A = {z ∈ ζ : ∀x(x ∈ α + z → x ∈ α ∨ ∃g ∈ z[x ∈ α + g])}.

Clearly, 0 ∈ A, so take γ ∈ ζ such that seg(ζ, γ) ⊆ A.


• Let γ = δ⁺ for some ordinal δ. Take x ∈ α + δ⁺, which implies that
x ∈ (α + δ)⁺. This means that x ∈ α + δ or x = α + δ. If x ∈ α + δ, then
we are done. If x = α + δ, then x ∈ (α + δ)⁺ = α + δ⁺.


• Let γ be a limit ordinal. Take x ∈ α + γ. This means that x ∈ α + δ for
some δ ∈ γ. Therefore, x ∈ α or x ∈ α + d for some d ∈ δ ⊆ γ. ∎


Using Lemma 6.4.2, we can prove the next useful result. 


LEMMA 6.4.3
Let ζ be a limit ordinal. If α, β ∈ ζ, then α + β = α ∪ {α + b : b ∈ β}.


PROOF
Define
A = {z ∈ ζ : α + z = α ∪ {α + g : g ∈ z}}.


We have that 0 ∈ A, so assume seg(ζ, γ) ⊆ A, where γ ∈ ζ.
• Suppose γ = δ⁺. Then,
α ∪ {α + d : d ∈ δ⁺} = α ∪ {α + d : d ∈ δ} ∪ {α + δ}
= (α + δ) ∪ {α + δ}
= (α + δ)⁺
= α + δ⁺.

• Let γ be a limit ordinal. Take x ∈ α + γ. By Lemma 6.4.2, we have x ∈ α
or x ∈ α + g for some g ∈ γ. If the former, we are done, so suppose the
latter. In this case, the assumption gives

α + g = α ∪ {α + g′ : g′ ∈ g}.


Thus, x = α + g′ for some g′ ∈ g ∈ γ. Conversely, if x ∈ α, then x ∈ α + 0,
so x ∈ α + γ. For the other case, let x ∈ {α + g : g ∈ γ}. This implies that
x ∈ (α + g)⁺ = α + g⁺ for some g ∈ γ. Since g⁺ ∈ γ,


x ∈ ⋃{α + g′ : g′ ∈ γ} = α + γ. ∎


Although ordinal addition is not commutative, it does have other familiar properties 
as noted in the next result. 


THEOREM 6.4.4


Addition of ordinals is associative, and 0 is the additive identity.


PROOF
Let ζ be a limit ordinal. Define


A = {β ∈ ζ : 0 + β = β}.

Suppose γ is an ordinal such that seg(ζ, γ) ⊆ A. We have two cases to check.


• Let γ = δ⁺ for some ordinal δ. Then,


0 + γ = 0 + δ⁺ = (0 + δ)⁺ = δ⁺ = γ.


• Let γ be a limit ordinal. We then have


0 + γ = ⋃_{δ∈γ} (0 + δ) = ⋃_{δ∈γ} δ = γ.


In both cases, γ ∈ A, so by transfinite induction, A = ζ. Since we can also prove
that


ζ = {β ∈ ζ : β + 0 = β},


we conclude that 0 is the additive identity.
To prove that ordinal addition is associative, we proceed by transfinite induction.
Let α, β, δ ∈ ζ. Define


B = {z ∈ ζ : α + (β + z) = (α + β) + z}.

Assume that seg(ζ, γ) ⊆ B. Then, γ ∈ B because by Lemma 6.4.3, we have


α + (β + γ) = α ∪ {α + x : x ∈ β + γ}
= α ∪ {α + x : x ∈ β ∨ ∃g(g ∈ γ ∧ x = β + g)}
= α ∪ ({α + x : x ∈ β} ∪ {α + (β + g) : g ∈ γ})
= α ∪ {α + x : x ∈ β} ∪ {(α + β) + g : g ∈ γ}
= (α + β) ∪ {(α + β) + g : g ∈ γ}
= (α + β) + γ.


Note that Exercise 3.4.28(e) is used on the fourth equality. ∎


DEFINITION 6.4.5


Let ζ be a limit ordinal. For all α, β ∈ ζ,


• α · 0 = 0,

• α · β⁺ = α · β + α,

• α · β = ⋃{α · δ : δ ∈ β} if β is a limit ordinal.

As with ordinal addition, ordinal multiplication is well-defined by transfinite recursion
(Exercise 3). Also, as with addition of ordinals, certain expected properties hold, while
others do not. The next two results are the analogs of Lemma 6.4.3 and Theorem 6.4.4.
Their proofs are Exercise 4.


LEMMA 6.4.6


If α and β are ordinals, α · β = {α · b + a : b ∈ β ∧ a ∈ α}.


M@ THEOREM 6.4.7 


Multiplication of ordinals is associative, and | is the multiplicative identity. 


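Observe that
$$0 \cdot \omega = \bigcup\{0 \cdot n : n \in \omega\} = 0.$$

Also, $\omega \cdot 1 = 1 \cdot \omega = \omega$ (Theorem 6.4.7), so multiplication on the right by a natural
number behaves as we would expect in that

$$\omega \cdot 2 = \omega \cdot 1 + \omega = \omega + \omega \qquad (6.1)$$
and
$$\omega \cdot 3 = \omega \cdot 2 + \omega = \omega + \omega + \omega,$$
but
$$2 \cdot \omega = \bigcup\{2 \cdot n : n \in \omega\} = \omega$$
and
$$3 \cdot \omega = \bigcup\{3 \cdot n : n \in \omega\} = \omega.$$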


Hence, multiplication of ordinals is not commutative. Because of this, it is not surpris- 
ing that there are issues with the distributive law. For ordinals, there is a left distributive 
law but not a right distributive law (Exercise 9). 
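
The computations above can be replayed on a small scale. The following is only a sketch under stated assumptions: the class `Ord` and the helpers `nat`, `add`, `times_nat`, and `nat_times` are hypothetical names introduced here, and the encoding covers only ordinals of the form $\omega \cdot a + b$ with $a$ and $b$ natural numbers. Right multiplication by a natural number follows the successor clause of Definition 6.4.5, and left multiplication uses the fact computed above that $n \cdot \omega = \omega$ for nonzero $n$.

```python
from typing import NamedTuple


class Ord(NamedTuple):
    """Hypothetical encoding of the ordinal omega*a + b (a, b natural numbers)."""
    a: int  # coefficient of omega
    b: int  # finite remainder


OMEGA = Ord(1, 0)


def nat(n: int) -> Ord:
    """The natural number n viewed as an ordinal."""
    return Ord(0, n)


def add(x: Ord, y: Ord) -> Ord:
    """x + y: the finite tail of x is absorbed whenever y has an omega part."""
    if y.a > 0:
        return Ord(x.a + y.a, y.b)
    return Ord(x.a, x.b + y.b)


def times_nat(x: Ord, n: int) -> Ord:
    """x * n, built from x*0 = 0 and x*(k+1) = x*k + x."""
    result = nat(0)
    for _ in range(n):
        result = add(result, x)
    return result


def nat_times(n: int, x: Ord) -> Ord:
    """n * x for a natural number n, using n*omega = omega when n > 0."""
    if n == 0:
        return nat(0)
    return Ord(x.a, n * x.b)


print(times_nat(OMEGA, 2) == add(OMEGA, OMEGA))   # omega*2 = omega + omega
print(nat_times(2, OMEGA) == OMEGA)               # 2*omega = omega
print(add(nat(1), OMEGA), add(OMEGA, nat(1)))     # 1 + omega versus omega + 1
```

In this encoding the failure of the right distributive law is also visible: $(1 + 1) \cdot \omega$ is `nat_times(2, OMEGA)`, which equals `OMEGA`, while $1 \cdot \omega + 1 \cdot \omega$ is `add(OMEGA, OMEGA)`.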


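THEOREM 6.4.8 [Left Distributive Law]
$\alpha \cdot (\beta + \delta) = \alpha \cdot \beta + \alpha \cdot \delta$ for all ordinals $\alpha$, $\beta$, and $\delta$.


Since addition of ordinals is an operation on a limit ordinal $\zeta$, we know that for all
ordinals $\alpha, \beta, \delta \in \zeta$,
$$\alpha = \beta \Rightarrow \alpha + \delta = \beta + \delta$$
and
$$\alpha = \beta \Rightarrow \alpha \cdot \delta = \beta \cdot \delta.$$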


The next result gives information regarding how ordinal multiplication behaves with 
an inequality. 




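THEOREM 6.4.9
Let $\alpha$, $\beta$, and $\delta$ be ordinals.

• If $\alpha \subset \beta$, then $\delta + \alpha \subset \delta + \beta$.
• If $\alpha \subseteq \beta$, then $\alpha + \delta \subseteq \beta + \delta$.
• If $\alpha \subset \beta$ and $\delta \neq 0$, then $\delta \cdot \alpha \subset \delta \cdot \beta$.
• If $\alpha \subseteq \beta$, then $\alpha \cdot \delta \subseteq \beta \cdot \delta$.

PROOF
We prove the third part, leaving the others to Exercise 7. Let $\alpha \subset \beta$ and $\delta \neq 0$.
Then, by Lemma 6.4.6,

$$\delta \cdot \alpha = \{\delta \cdot a + d : a \in \alpha \land d \in \delta\} \subset \{\delta \cdot b + d : b \in \beta \land d \in \delta\} = \delta \cdot \beta. \; ∎$$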


Finally, we define exponentiation so that it generalizes exponentiation on $\omega$
(Exercise 5.2.17). It is a binary operation (Exercise 3).


DEFINITION 6.4.10

Let $\zeta$ be a limit ordinal and $\alpha, \beta \in \zeta$.

• $\alpha^0 = 1$,
• $\alpha^{\beta^+} = \alpha^\beta \cdot \alpha$,
• $\alpha^\beta = \bigcup\{\alpha^\delta : \delta \in \beta\}$ if $\beta$ is a limit ordinal.


For example, 
and 


so raising an ordinal to a natural number appears to behave as expected. Also, 


$$1^\omega = \bigcup\{1^n : n \in \omega\} = 1$$

and
$$2^\omega = \bigcup\{2^n : n \in \omega\} = \omega.$$


We leave the proof of the following properties of ordinal exponentiation to Exercise 11. 




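THEOREM 6.4.11
Let $\alpha$, $\beta$, $\delta$ be ordinals.

• $\alpha^{\beta + \delta} = \alpha^\beta \cdot \alpha^\delta$.
• $(\alpha^\beta)^\delta = \alpha^{\beta \cdot \delta}$.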


Cardinals 


Even though the finite cardinals are the same sets as the finite ordinals and every infinite 
cardinal is a limit ordinal, the arithmetic defined on the cardinals will only apply when 
the given sets are viewed as cardinals. The definitions for addition, multiplication, and 
exponentiation for cardinals are not given recursively. 


DEFINITION 6.4.12
Let $\kappa$ and $\lambda$ be cardinals.

• $\kappa + \lambda = |(\kappa \times \{0\}) \cup (\lambda \times \{1\})|$.
• $\kappa \cdot \lambda = |\kappa \times \lambda|$.
• $\kappa^\lambda = |{}^\lambda\kappa|$.

Since ordinal arithmetic was simply a generalization of the arithmetic of natural numbers,
that the addition worked in the finite case was not checked. Here this is not the
case, so let us check

$$2 + 3 = |(2 \times \{0\}) \cup (3 \times \{1\})| = |\{(0,0), (1,0), (0,1), (1,1), (2,1)\}| = 5.$$
Also,
$$3 + 2 = |(3 \times \{0\}) \cup (2 \times \{1\})| = |\{(0,0), (1,0), (2,0), (0,1), (1,1)\}| = 5.$$


This suggests that cardinal addition is commutative. This and other basic results are 
given in the next theorem. Some details of the proof are left to Exercise 14. 
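
For finite cardinals, the sum and product of Definition 6.4.12 can be computed directly from the sets involved. The sketch below is an illustration only; the function names `cardinal`, `card_add`, and `card_mul` are hypothetical, and a finite cardinal $n$ is represented by the Python set `{0, ..., n-1}`.

```python
from itertools import product


def cardinal(n: int) -> set:
    """The finite cardinal n as the set {0, 1, ..., n-1}."""
    return set(range(n))


def card_add(kappa: set, lam: set) -> int:
    """|(kappa x {0}) U (lam x {1})|: the tags 0 and 1 keep the union disjoint."""
    tagged = {(a, 0) for a in kappa} | {(b, 1) for b in lam}
    return len(tagged)


def card_mul(kappa: set, lam: set) -> int:
    """|kappa x lam|: the size of the Cartesian product."""
    return len(set(product(kappa, lam)))


print(card_add(cardinal(2), cardinal(3)), card_add(cardinal(3), cardinal(2)))
print(card_mul(cardinal(2), cardinal(3)), card_mul(cardinal(3), cardinal(2)))
```

The tagging step matters: without it, the union of 2 and 3 as sets would simply be 3, since $2 \subseteq 3$.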


THEOREM 6.4.13

Addition of cardinals is associative and commutative, and 0 is the additive identity.


PROOF
Let $\kappa$, $\lambda$, and $\mu$ be cardinals. Addition is associative because

$$(\kappa \times \{0\}) \cup ([(\lambda \times \{0\}) \cup (\mu \times \{1\})] \times \{1\})$$
is equinumerous to

$$([(\kappa \times \{0\}) \cup (\lambda \times \{1\})] \times \{0\}) \cup (\mu \times \{1\}),$$

it is commutative because
$$(\kappa \times \{0\}) \cup (\lambda \times \{1\}) \approx (\lambda \times \{0\}) \cup (\kappa \times \{1\}),$$
and 0 is the additive identity because
$$(\kappa \times \{0\}) \cup (0 \times \{1\}) = \kappa \times \{0\}.$$

Now let us multiply
$$2 \cdot 3 = |\{0,1\} \times \{0,1,2\}| = |\{(0,0), (0,1), (0,2), (1,0), (1,1), (1,2)\}| = 6$$
and
$$3 \cdot 2 = |\{0,1,2\} \times \{0,1\}| = |\{(0,0), (0,1), (1,0), (1,1), (2,0), (2,1)\}| = 6.$$


As with cardinal addition, it seems that cardinal multiplication is commutative. This 
and other results are stated in the next theorem. Its proof is left to Exercise 15. 


THEOREM 6.4.14

Multiplication of cardinals is associative and commutative, and 1 is the multiplicative
identity.


As with ordinal arithmetic (Theorem 6.4.8), cardinal arithmetic has a left distribution 
law, but since cardinal multiplication is commutative, cardinal arithmetic also has a 
right distribution law. 


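THEOREM 6.4.15 [Distributive Law]
Let $\kappa$, $\lambda$, and $\mu$ be cardinals.

• $\kappa \cdot (\lambda + \mu) = \kappa \cdot \lambda + \kappa \cdot \mu$.
• $(\kappa + \lambda) \cdot \mu = \kappa \cdot \mu + \lambda \cdot \mu$.


PROOF
The left distribution law holds because

$$\kappa \times ([\lambda \times \{0\}] \cup [\mu \times \{1\}]) \approx ([\kappa \times \lambda] \times \{0\}) \cup ([\kappa \times \mu] \times \{1\}).$$
The remaining details of the proof are left to Exercise 16.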


The last of the operations of Definition 6.4.12 is exponentiation. Let $\kappa$ be a cardinal.
Observe that since there is exactly one function $0 \to \kappa$ (Exercise 4.4.16),

$$\kappa^0 = |{}^0\kappa| = 1$$

and if $\kappa \neq 0$,
$$0^\kappa = |{}^\kappa 0| = 0$$

because there are no functions $\kappa \to 0$, and by Theorem 6.2.13,
$$\kappa < 2^\kappa.$$
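
These observations can be checked for finite cardinals by listing the functions themselves. This is a sketch under the assumption that a function from $\lambda$ to $\kappa$ is stored as a tuple of its values; the names `functions` and `card_exp` are hypothetical.

```python
from itertools import product


def functions(domain: int, codomain: int) -> list:
    """All functions {0..domain-1} -> {0..codomain-1}, each as a tuple of values."""
    return list(product(range(codomain), repeat=domain))


def card_exp(kappa: int, lam: int) -> int:
    """kappa^lam as the number of functions from lam to kappa."""
    return len(functions(lam, kappa))


print(card_exp(5, 0))   # one function with empty domain
print(card_exp(0, 3))   # no functions into the empty set from a nonempty set
print(all(k < card_exp(2, k) for k in range(8)))
```

The empty tuple plays the role of the unique function with domain 0, which is why `card_exp(kappa, 0)` is 1 for every `kappa`, including 0.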


In addition, cardinal exponentiation follows other expected rules. 




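THEOREM 6.4.16
Let $\kappa$, $\lambda$, and $\mu$ be cardinals.

• $\kappa^{\lambda + \mu} = \kappa^\lambda \cdot \kappa^\mu$.
• $(\kappa \cdot \lambda)^\mu = \kappa^\mu \cdot \lambda^\mu$.
• $(\kappa^\lambda)^\mu = \kappa^{\lambda \cdot \mu}$.


PROOF
We prove the last part and leave the rest to Exercise 17. Define

$$\varphi : {}^\mu({}^\lambda\kappa) \to {}^{\lambda \times \mu}\kappa$$
such that for all $\psi \in {}^\mu({}^\lambda\kappa)$, $\alpha \in \lambda$, and $\beta \in \mu$,

$$\varphi(\psi)(\alpha, \beta) = \psi(\beta)(\alpha).$$

We claim that $\varphi$ is a bijection.

• Let $\psi_1, \psi_2 \in {}^\mu({}^\lambda\kappa)$. Assume that $\varphi(\psi_1) = \varphi(\psi_2)$. Take $\alpha \in \lambda$ and $\beta \in \mu$.
Then,

$$\psi_1(\beta)(\alpha) = \varphi(\psi_1)(\alpha, \beta) = \varphi(\psi_2)(\alpha, \beta) = \psi_2(\beta)(\alpha).$$

Therefore, $\psi_1 = \psi_2$, and $\varphi$ is one-to-one.

• Let $\psi \in {}^{\lambda \times \mu}\kappa$. For $\alpha \in \lambda$ and $\beta \in \mu$, define $\psi'(\beta)(\alpha) = \psi(\alpha, \beta)$. This
implies that $\varphi(\psi') = \psi$, so $\varphi$ is onto.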
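
The bijection in this proof is the familiar passage between a function that returns functions and a function of two arguments. The sketch below checks it exhaustively for small finite cardinals. It rests on assumptions: a member of ${}^\mu({}^\lambda\kappa)$ is read as a function that takes an element of $\mu$ and returns a function on $\lambda$, functions are stored as tuples of values, and the names `inner_functions`, `outer_functions`, `phi`, and `check_bijection` are hypothetical.

```python
from itertools import product


def inner_functions(lam: int, kappa: int) -> list:
    """All functions lam -> kappa, each stored as a tuple of values."""
    return list(product(range(kappa), repeat=lam))


def outer_functions(lam: int, mu: int, kappa: int) -> list:
    """All functions mu -> (lam -> kappa), stored as tuples of tuples."""
    return list(product(inner_functions(lam, kappa), repeat=mu))


def phi(psi: tuple, lam: int, mu: int) -> dict:
    """Send psi to the two-argument function (a, b) -> psi(b)(a)."""
    return {(a, b): psi[b][a] for a in range(lam) for b in range(mu)}


def check_bijection(kappa: int, lam: int, mu: int) -> bool:
    """True if phi is one-to-one and onto the functions lam x mu -> kappa."""
    images = [tuple(sorted(phi(psi, lam, mu).items()))
              for psi in outer_functions(lam, mu, kappa)]
    one_to_one = len(set(images)) == len(images)
    onto = len(set(images)) == kappa ** (lam * mu)
    return one_to_one and onto


print(check_bijection(2, 2, 3))
print(len(outer_functions(2, 3, 2)), 2 ** (2 * 3))
```

The onto check relies on counting: there are exactly $\kappa^{\lambda \cdot \mu}$ functions on a set of $\lambda \cdot \mu$ pairs when everything is finite, so hitting that many distinct images means every such function is reached.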


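Since every infinite cardinal number can be represented by an $\aleph$, let us determine
how to calculate using this notation. We begin with a lemma.


LEMMA 6.4.17

If $n \in \omega$ and $\kappa$ is an infinite cardinal, $n + \kappa = n \cdot \kappa = \kappa$.


PROOF
Let $n$ be a natural number. Define $\varphi : (n \times \{0\}) \cup (\kappa \times \{1\}) \to \kappa$ by

$$\varphi(i, 0) = i \text{ for all } i < n,$$
$$\varphi(\alpha, 1) = n + \alpha \text{ for all } \alpha \in \kappa.$$

For example, if $n = 5$, then $\varphi(4, 0) = 4$, $\varphi(0, 1) = 5$, $\varphi(6, 1) = 11$, $\varphi(\omega, 1) = \omega$,
and $\varphi(\omega + 1, 1) = \omega + 1$. Therefore, $n + \kappa = \kappa$ because $\varphi$ is a bijection. That
$n \cdot \kappa = \kappa$ is left to Exercise 18. ∎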
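
The map in this proof can be tested on finite pieces of the case $\kappa = \omega$. The sketch below is only a partial illustration: it handles natural-number arguments, so values such as the image of $(\omega, 1)$ are outside its scope, and the names `phi` and `check_initial_segment` are hypothetical.

```python
def phi(pair: tuple, n: int) -> int:
    """The map from the proof: (i, 0) -> i for i < n, and (a, 1) -> n + a."""
    value, tag = pair
    if tag == 0:
        if not value < n:
            raise ValueError("first component must be below n when the tag is 0")
        return value
    return n + value


def check_initial_segment(n: int, bound: int) -> bool:
    """True if phi maps the tagged pairs below `bound` exactly onto {0, ..., n+bound-1}."""
    domain = [(i, 0) for i in range(n)] + [(a, 1) for a in range(bound)]
    images = [phi(p, n) for p in domain]
    return sorted(images) == list(range(n + bound))


print(phi((4, 0), 5), phi((0, 1), 5), phi((6, 1), 5))
print(check_initial_segment(5, 100))
```

Because the images fill an initial segment of the natural numbers without repetition for every `bound`, the map restricted to natural numbers is one-to-one and onto $\omega$, which is the finite shadow of the bijection used in the proof.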


Lemma 6.4.17 allows us to compute with alephs. 




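THEOREM 6.4.18
Let $\alpha$ and $\beta$ be ordinals and $n \in \omega$.

• $n + \aleph_\alpha = n \cdot \aleph_\alpha = \aleph_\alpha$.

• $\aleph_\alpha + \aleph_\beta = \aleph_\alpha \cdot \aleph_\beta = \begin{cases} \aleph_\alpha & \text{if } \alpha > \beta, \\ \aleph_\beta & \text{otherwise.} \end{cases}$


PROOF
The first part follows by Lemma 6.4.17. To prove the addition equation from
the second part, let $\alpha$ and $\beta$ be ordinals. Without loss of generality assume that
$\alpha \subseteq \beta$. By Definition 6.4.12,

$$\aleph_\alpha + \aleph_\beta = |(\aleph_\alpha \times \{0\}) \cup (\aleph_\beta \times \{1\})|.$$
Since $\aleph_\beta = |\aleph_\beta \times \{1\}|$,
$$\aleph_\beta \le |(\aleph_\alpha \times \{0\}) \cup (\aleph_\beta \times \{1\})|.$$
Furthermore, because $\aleph_\alpha \subseteq \aleph_\beta$,
$$\begin{aligned}
|(\aleph_\alpha \times \{0\}) \cup (\aleph_\beta \times \{1\})| &\le |(\aleph_\beta \times \{0\}) \cup (\aleph_\beta \times \{1\})| \\
&= |\aleph_\beta \times \{0, 1\}| \\
&= \aleph_\beta.
\end{aligned}$$

Because of Lemma 6.4.17, the last equality holds since $\aleph_\beta$ is infinite and $\{0, 1\}$
is finite. Hence,
$$\aleph_\alpha + \aleph_\beta \approx \aleph_\beta$$

by the Cantor-Schröder-Bernstein theorem (6.2.11). Since both $\aleph_\alpha + \aleph_\beta$ and $\aleph_\beta$
are cardinals, $\aleph_\alpha + \aleph_\beta = \aleph_\beta$.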


For example, $\aleph_5 + \aleph_9 = \aleph_5 \cdot \aleph_9 = \aleph_9$. More generally, we quickly have the following
corollary by Theorem 6.3.19. 


COROLLARY 6.4.19

For every infinite cardinal $\kappa$, both $\kappa + \kappa = \kappa$ and $\kappa \cdot \kappa = \kappa$.


Exercises 


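1. Let $\mathcal{F} = \{A_0, A_1, \ldots, A_{n-1}\}$ be a pairwise disjoint family of sets. Assuming that
the sets are distinct, prove that the cardinality of $\bigcup \mathcal{F}$ is equal to the sum

$$|A_0| + |A_1| + \cdots + |A_{n-1}|.$$

2. Let $\alpha$ and $\beta$ be ordinals. Let $\varphi : \alpha \to \beta$ be a function such that $\varphi(\delta) \in \varphi(\gamma)$ for all
$\delta \in \gamma \in \alpha$ (compare Lemma 6.1.2). Prove the following.

(a) $\alpha \subseteq \beta$.
(b) $\delta \subseteq \varphi(\delta)$ for all $\delta \in \alpha$.

3. Let $\zeta$ be a limit ordinal. Use transfinite recursion to prove that ordinal multiplication
(Definition 6.4.5) and ordinal exponentiation (Definition 6.4.10) are binary operations
on $\zeta$.

4. Prove Lemma 6.4.6 and Theorem 6.4.7.

5. For every ordinal $\alpha$, prove that $0 \cdot \alpha = 0$.

6. Prove that for all $n \in \omega$, $n + \omega = n \cdot \omega$ as ordinals. Can this be generalized to
$\alpha + \beta = \alpha \cdot \beta$ for ordinals $\alpha \in \beta$ with $\beta$ being infinite? If so, is the $\alpha \in \beta$ required?

7. Prove the remaining parts of Theorem 6.4.9.

8. Find ordinals $\alpha$, $\beta$, and $\delta$ such that the following properties hold.

(a) $\alpha \subset \beta$ but $\beta + \delta \subseteq \alpha + \delta$.
(b) $\alpha \subset \beta$ but $\beta \cdot \delta \subseteq \alpha \cdot \delta$.

9. Let $\alpha$, $\beta$, and $\delta$ be ordinals.
(a) Prove that $\alpha \cdot (\beta + \delta) = \alpha \cdot \beta + \alpha \cdot \delta$.
(b) Show that it might be the case that $(\alpha + \beta) \cdot \delta \neq \alpha \cdot \delta + \beta \cdot \delta$.
(c) For which ordinals does the right distribution law hold?

10. Let $\alpha$, $\beta$, and $\delta$ be ordinals. Prove the following.
(a) $\alpha + \beta \in \alpha + \delta$ if and only if $\beta \in \delta$.
(b) $\alpha + \beta = \alpha + \delta$ if and only if $\beta = \delta$.
(c) If $\alpha + \delta \in \beta + \delta$, then $\alpha \in \beta$.
(d) $\alpha \cdot \beta \in \alpha \cdot \delta$ if and only if $\beta \in \delta$ and $\alpha \neq 0$.
(e) If $\alpha \cdot \beta = \alpha \cdot \delta$, then $\beta = \delta$ or $\alpha = 0$.

11. Prove Theorem 6.4.11.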


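12. Let $\alpha$, $\beta$, and $\delta$ be ordinals. Prove the following.
(a) $\alpha^\beta \in \alpha^\delta$ if and only if $\beta \in \delta$ and $1 \in \alpha$.
(b) If $\alpha \in \beta$, then $\alpha^\delta \subseteq \beta^\delta$.
(c) If $\alpha^\delta \in \beta^\delta$, then $\alpha \in \beta$.
(d) If $1 \in \alpha$, then $\beta \subseteq \alpha^\beta$.
(e) If $\alpha \in \beta$, there exists a unique ordinal $\gamma$ such that $\alpha + \gamma = \beta$.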


13. Prove that the ordinal $\omega + \omega$ is not a cardinal.

14. Provide the details for the proof of Theorem 6.4.13.

15. Prove Theorem 6.4.14.

16. Provide the details to the proof of Theorem 6.4.15.

17. Prove the remaining parts of Theorem 6.4.16.

18. Prove that for any natural number $n$ and infinite cardinal $\kappa$, $n \cdot \kappa = \kappa$.


Section 6.5 LARGE CARDINALS 327 


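19. Prove that if $\kappa$ is infinite, $(\kappa^+)^\kappa = 2^\kappa$.

20. Is there a cancellation law for ordinals or for cardinals?

21. Prove that $\kappa + \lambda = \kappa \cdot \lambda = \lambda$, given that $\kappa$ is a countable cardinal and $\lambda$ is an infinite
cardinal.

22. Generalize Exercise 21 by showing that if $\kappa$ and $\lambda$ are cardinals with $\lambda$ infinite
such that $\kappa \le \lambda$, then $\kappa + \lambda = \kappa \cdot \lambda = \lambda$.

23. Let $\kappa$ and $\lambda$ be cardinals with $\aleph_0 \le \lambda$. Show that if $2 \le \kappa \le \lambda$, then $\kappa^\lambda = 2^\lambda$.

24. Prove for all ordinals $\alpha$ that $|\alpha| \le \aleph_\alpha$.

25. For all ordinals $\alpha$, define Hartogs' function by
$$\Gamma(\alpha) = \{\beta : \beta \text{ is an ordinal} \land \beta \preceq \alpha\}.$$
Prove that $\Gamma(\alpha)$ is an ordinal and $\alpha \prec \Gamma(\alpha)$ for all ordinals $\alpha$.

26. Using Exercise 25, define an initial number $\omega_\alpha$ as follows.
$$\omega_0 = \omega,$$
$$\omega_{\gamma^+} = \Gamma(\omega_\gamma),$$
$$\omega_\gamma = \bigcup_{\delta \in \gamma} \omega_\delta \text{ if } \gamma \text{ is a limit ordinal.}$$

Prove that initial numbers are limit ordinals and $\omega_1$ is the first uncountable ordinal.

27. Prove that there is no greatest initial number.

28. For all ordinals $\alpha$, show that $\aleph_\alpha = |\omega_\alpha|$. Can we write $\aleph_\alpha = \omega_\alpha$?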


29. For all countable ordinals a, show that 2%« = &,+ implies that ho = Ro 4 


6.5 LARGE CARDINALS 


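Since every cardinal is a limit ordinal, every cardinal $\kappa$ can be written in the form

$$\kappa = \bigcup\{\alpha : \alpha \in \kappa\}.$$

In particular, for the limit cardinal $\aleph_{\omega+\omega}$ we have that

$$\aleph_{\omega+\omega} = \bigcup\{\alpha : \alpha \in \aleph_{\omega+\omega}\}. \qquad (6.2)$$
Notice that (6.2) is the union of a set with $\aleph_{\omega+\omega}$ elements. However, we also have
$$\aleph_{\omega+\omega} = \bigcup\{\aleph_\alpha : \alpha \in \omega + \omega\}, \qquad (6.3)$$
and
$$\aleph_{\omega+\omega} = \bigcup\{\aleph_{\omega+n} : n \in \omega\}. \qquad (6.4)$$

Both (6.3) and (6.4) are unions of sets with $\aleph_0$ elements. The next definition is
introduced to handle these differences. Since infinite cardinals are limit ordinals, the
definition is given for limit ordinals.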




DEFINITION 6.5.1

The cofinality of a limit ordinal $\alpha$ is denoted by $\mathrm{cf}(\alpha)$ and defined as the least
cardinal $\lambda$ such that there exists $\mathcal{F} \subseteq \alpha$ with $|\mathcal{F}| = \lambda$ and $\alpha = \bigcup \mathcal{F}$.


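Observe that $\mathrm{cf}(\alpha) \le |\alpha|$ because $\alpha = \bigcup \alpha$. There can be other sets $B$ such that
$\alpha = \bigcup B$, and they might have different cardinalities. Any set of ordinals $B$ with this
property is said to be cofinal in $\alpha$. Moreover, we can write $B = \{\beta_\delta : \delta \in \kappa\}$ for some
cardinal $\kappa$, so

$$\alpha = \bigcup_{\delta \in \kappa} \beta_\delta,$$

and when $\kappa = \mathrm{cf}(\alpha)$,

$$\alpha = \bigcup_{\delta \in \mathrm{cf}(\alpha)} \beta_\delta.$$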


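EXAMPLE 6.5.2

Since a finite union of finite sets is finite, the cofinality of any infinite limit ordinal must
be infinite. Therefore, because
$$\omega = \bigcup_{n \in \omega} n,$$

we see that $\mathrm{cf}(\omega) = \aleph_0$. Also, since

$$\aleph_\omega = \bigcup_{n \in \omega} \aleph_n,$$

we conclude that $\mathrm{cf}(\aleph_\omega) = \aleph_0$ and $\{\aleph_n : n \in \omega\}$ is cofinal in $\aleph_\omega$. However,
$\mathrm{cf}(\aleph_1) = \aleph_1$ because the countable union of countable sets is countable
(Theorem 6.3.17).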


Regular and Singular Cardinals 


Since infinite cardinals are limit ordinals, we can classify the cardinals based on their 
cofinalities. We make the following definition. 


DEFINITION 6.5.3
A cardinal $\kappa$ is regular if $\kappa = \mathrm{cf}(\kappa)$, else it is singular.


Notice that Example 6.5.2 shows that $\aleph_0$ and $\aleph_1$ are regular but $\aleph_\omega$ is singular. This
implies a direction to follow to characterize the cardinal numbers. We begin with the
successors.


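THEOREM 6.5.4

Successor cardinals are regular.


PROOF
Let $\alpha$ be an ordinal and let $\mathcal{F} \subseteq \aleph_{\alpha+1}$ such that $\aleph_{\alpha+1} = \bigcup \mathcal{F}$. This implies that
$|\beta| \le \aleph_\alpha$ for all $\beta \in \mathcal{F}$. Thus,

$$\left|\bigcup \mathcal{F}\right| \le |\mathcal{F}| \cdot \aleph_\alpha.$$

By Theorem 6.4.18, we conclude that $\aleph_{\alpha+1} \le |\mathcal{F}|$, so $\mathrm{cf}(\aleph_{\alpha+1}) = \aleph_{\alpha+1}$ by the
Cantor-Schröder-Bernstein theorem (6.2.11).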


Since $\aleph_0$ is regular, Theorem 6.5.4 tells us where to find the singular cardinals.


COROLLARY 6.5.5


A singular cardinal is an uncountable limit cardinal. 


Because we have not proved the converse of Corollary 6.5.5, we investigate the cofi- 
nality of certain limit cardinals in an attempt to determine which limit cardinals are 
singular. 


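THEOREM 6.5.6

If $\alpha$ is a limit ordinal, $\mathrm{cf}(\aleph_\alpha) = \mathrm{cf}(\alpha)$.


PROOF
The proof uses the Cantor-Schröder-Bernstein theorem (6.2.11). Let $\alpha$ be a limit
ordinal.

• We first show that $\mathrm{cf}(\aleph_\alpha) \le \mathrm{cf}(\alpha)$. Let $A$ be a cofinal subset of $\alpha$ such that
$|A| = \mathrm{cf}(\alpha)$. Notice that if $\beta \in A$, then $\aleph_\beta \in \aleph_\alpha$. On the other hand, take
$\delta \in \aleph_\alpha$, which implies that $|\delta| < \aleph_\alpha$. Therefore, there exists $\gamma \in \alpha$ such
that $|\delta| = \aleph_\gamma$. Since $A$ is cofinal in $\alpha$, there exists $\zeta \in A$ such that $\gamma \in \zeta$.
Hence, $\delta \in \aleph_\zeta$, which implies that $\delta \in \bigcup\{\aleph_\beta : \beta \in A\}$. Therefore,

$$\aleph_\alpha = \bigcup\{\aleph_\beta : \beta \in A\},$$

from which follows
$$\mathrm{cf}(\aleph_\alpha) \le |A| = \mathrm{cf}(\alpha).$$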


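• We now show $\mathrm{cf}(\alpha) \le \mathrm{cf}(\aleph_\alpha)$. Let $A \subseteq \aleph_\alpha$ so that $\aleph_\alpha = \bigcup A$ and $|A| =
\mathrm{cf}(\aleph_\alpha)$. Define

$$\mathcal{F} = \{\delta \in \alpha : \exists\gamma(\gamma \in A \land |\gamma| = \aleph_\delta)\}.$$

Then, $\zeta = \bigcup \mathcal{F}$ is an ordinal by Theorem 6.1.16. For all $\varepsilon \in A$, we have
that $\varepsilon \in \aleph_{\zeta^+}$ because $|\varepsilon| \le \aleph_\zeta$. Hence,

$$\bigcup A \subseteq \aleph_{\zeta^+},$$

from which follows that $\alpha \in \zeta^+$, which means that $\alpha \subseteq \bigcup \mathcal{F}$. Therefore,
since the elements of $\mathcal{F}$ are ordinals of $\alpha$, we have that $\alpha = \bigcup \mathcal{F}$, so

$$\mathrm{cf}(\alpha) \le |\mathcal{F}| = |A| = \mathrm{cf}(\aleph_\alpha).$$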




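Theorem 6.5.6 confirms the result of Example 6.5.2 because

$$\mathrm{cf}(\aleph_\omega) = \mathrm{cf}(\omega) = \aleph_0. \qquad (6.5)$$
Also,
$$\mathrm{cf}(\aleph_{\omega+\omega}) = \mathrm{cf}(\omega + \omega) = \aleph_0, \qquad (6.6)$$
so $\aleph_{\omega+\omega}$ is singular. Observe that by (6.5) and (6.6),
$$\mathrm{cf}(\mathrm{cf}(\aleph_\omega)) = \mathrm{cf}(\mathrm{cf}(\omega)) = \mathrm{cf}(\aleph_0) = \aleph_0$$
and

$$\mathrm{cf}(\mathrm{cf}(\aleph_{\omega+\omega})) = \mathrm{cf}(\mathrm{cf}(\omega + \omega)) = \mathrm{cf}(\aleph_0) = \aleph_0.$$

The next result generalizes this and proves that $\mathrm{cf}(\mathrm{cf}(\alpha)) = \mathrm{cf}(\alpha)$ for every limit ordinal
$\alpha$.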


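THEOREM 6.5.7

For any limit ordinal $\alpha$, $\mathrm{cf}(\alpha)$ is a regular cardinal.


PROOF
Let $\alpha$ be a limit ordinal and write $\alpha = \bigcup\{A_\gamma : \gamma \in \mathrm{cf}(\alpha)\}$. For every ordinal
$\gamma \in \mathrm{cf}(\alpha)$, define $\alpha_\gamma = \bigcup_{\delta \in \gamma} A_\delta$. Then, $\{\alpha_\gamma : \gamma \in \mathrm{cf}(\alpha)\}$ is a chain of ordinals
(Theorem 6.1.16) and

$$\alpha = \bigcup\{\alpha_\gamma : \gamma \in \mathrm{cf}(\alpha)\}.$$
Now write
$$\mathrm{cf}(\alpha) = \bigcup\{\beta_\gamma : \gamma \in \mathrm{cf}(\mathrm{cf}(\alpha))\}.$$

Define
$$B = \{\alpha_{\beta_\gamma} : \gamma \in \mathrm{cf}(\mathrm{cf}(\alpha))\}.$$

Let $\zeta \in \alpha$. This implies that $\zeta \in \alpha_{\gamma_0}$ for some $\gamma_0 \in \mathrm{cf}(\alpha)$. Then, $\gamma_0 \in \beta_{\gamma_1}$ for
some $\gamma_1 \in \mathrm{cf}(\mathrm{cf}(\alpha))$. Hence,

$$\zeta \in \alpha_{\gamma_0} \subseteq \alpha_{\beta_{\gamma_1}} \in B,$$

so $\zeta \in \bigcup B$. Therefore, $\alpha = \bigcup B$, and this implies that $\mathrm{cf}(\alpha) \le \mathrm{cf}(\mathrm{cf}(\alpha))$. Since
the opposite inequality always holds, by the Cantor-Schröder-Bernstein theorem
(6.2.11), $\mathrm{cf}(\alpha) = \mathrm{cf}(\mathrm{cf}(\alpha))$, which means that $\mathrm{cf}(\alpha)$ is regular.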


Although the Continuum Hypothesis cannot be proved, it is possible to discover
some information about the value of $2^{\aleph_0}$. Notice how the proof of the next theorem
resembles Cantor's diagonalization (page 303). The theorem is due to König (1905).


THEOREM 6.5.8 [König]


If $\kappa$ is an infinite cardinal and $\mathrm{cf}(\kappa) \le \lambda$, then $\kappa < \kappa^\lambda$.




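PROOF
Suppose that $\kappa$ is an infinite cardinal number and $\mathrm{cf}(\kappa) = \lambda$. Write

$$\kappa = \bigcup\{\delta_\alpha : \alpha \in \lambda\}.$$
Let $F = \{f_\beta : \beta \in \kappa\}$ be a subset of ${}^\lambda\kappa$. Define $g : \lambda \to \kappa$ such that
$$g(\alpha) = \text{least element of } \kappa \setminus \{f_\beta(\alpha) : \beta \in \delta_\alpha\}.$$
For any $\alpha \in \lambda$,
$$g(\alpha) \neq f_\beta(\alpha) \text{ for all } \beta \in \delta_\alpha.$$
Therefore,

$$g \neq f_\beta \text{ for all } \beta \in \delta_\alpha. \qquad (6.7)$$

Since (6.7) is true for all $\alpha \in \lambda$ and $\{\delta_\alpha : \alpha \in \lambda\}$ is cofinal in $\kappa$, we conclude
that $g \neq f_\beta$ for all $\beta \in \kappa$. Therefore, $g \notin F$, so $\kappa < \kappa^\lambda$. Note that the same
argument leads to this conclusion if $\mathrm{cf}(\kappa) < \lambda$ (Exercise 4). ∎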


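COROLLARY 6.5.9


Let $\kappa$ be an infinite cardinal. Then, $\kappa < \mathrm{cf}(2^\kappa)$.


PROOF
Suppose that $\mathrm{cf}(2^\kappa) \le \kappa$. By König's theorem (6.5.8), Theorem 6.4.16, and
Corollary 6.4.19,

$$2^\kappa < (2^\kappa)^{\mathrm{cf}(2^\kappa)} \le (2^\kappa)^\kappa = 2^{\kappa \cdot \kappa} = 2^\kappa,$$
a contradiction.

By Corollary 6.5.9,
$$\aleph_0 < \mathrm{cf}(2^{\aleph_0}),$$
but by Example 6.5.2, we know that $\mathrm{cf}(\aleph_\omega) = \aleph_0$, so
$$\mathrm{cf}(\aleph_\omega) < \mathrm{cf}(2^{\aleph_0}).$$
Hence, even though we cannot prove what the cardinality of $2^{\aleph_0}$ is, we do know that it
is not $\aleph_\omega$.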
Inaccessible Cardinals


As we have noted, $\aleph_0$ is both a regular and a limit cardinal. Are there any others with
this property?

DEFINITION 6.5.10
A regular limit cardinal that is uncountable is called weakly inaccessible.

It is not possible using the axioms of ZFC to prove the existence of a weakly inaccessible
cardinal. Here is another class of cardinals “beyond” the weakly inaccessible
cardinals.




DEFINITION 6.5.11


The cardinal $\kappa$ is strongly inaccessible if it is an uncountable regular cardinal
such that $2^\lambda < \kappa$ for all $\lambda < \kappa$.


Since every strongly inaccessible cardinal is weakly inaccessible (Exercise 1), it is
not possible to prove from the axioms of ZFC that a strongly inaccessible cardinal
exists. However, assuming GCH, $\kappa$ is a weakly inaccessible cardinal
if and only if $\kappa$ is a strongly inaccessible cardinal (Exercise 10). Cardinal numbers such
as these are known as large cardinals because assumptions beyond the axioms of ZFC
are required to “reach” them.


Exercises 
1. Prove that every strongly inaccessible cardinal is weakly inaccessible. 


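2. Let $n \in \omega$. Show that $\mathrm{cf}(\aleph_n) = \aleph_n$.


3. For all limit ordinals $\alpha$, $\beta$, and $\delta$, show that if $\alpha$ is cofinal in $\beta$ and $\beta$ is cofinal in $\delta$,
then $\alpha$ is cofinal in $\delta$.


4. Rewrite the proof of König's theorem (6.5.8) assuming that $\mathrm{cf}(\kappa) < \lambda$.


5. Let $\alpha$ and $\beta$ be limit ordinals. Prove that $\mathrm{cf}(\alpha) = \mathrm{cf}(\beta)$ if and only if $(\alpha, \subseteq)$ and
$(\beta, \subseteq)$ have order isomorphic cofinal subsets.


6. Let $\alpha$ be a countable limit ordinal. Show that $\mathrm{cf}(\alpha) = \aleph_0$.


7. Let $\kappa$ and $\lambda$ be cardinals such that $\kappa$ is infinite and $2 \le \lambda$. Show the following.
(a) $\kappa < \mathrm{cf}(\lambda^\kappa)$.
(b) $\kappa < \kappa^{\mathrm{cf}(\kappa)}$.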
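8. Assume GCH and let $\alpha$ and $\beta$ be ordinals. Prove the following.
(a) If $\aleph_\beta < \mathrm{cf}(\aleph_\alpha)$, then $\aleph_\alpha^{\aleph_\beta} = \aleph_\alpha$.
(b) If $\mathrm{cf}(\aleph_\alpha) \le \aleph_\beta < \aleph_\alpha$, then $\aleph_\alpha^{\aleph_\beta} = \aleph_{\alpha+1}$.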
9. Let $\alpha$ be a limit ordinal. Show that $\mathrm{cf}(\beth_\alpha) = \mathrm{cf}(\alpha)$. (See Exercise 6.3.22.)


10. Prove that GCH implies that a cardinal is weakly inaccessible if and only if it is
strongly inaccessible.


11. Let $\kappa$ be a cardinal. Prove the following biconditionals.
(a) $\kappa$ is weakly inaccessible if and only if $\kappa$ is regular and $\aleph_\kappa = \kappa$.
(b) $\kappa$ is strongly inaccessible if and only if $\kappa$ is regular and $\beth_\kappa = \kappa$.


CHAPTER 7 


MODELS 


7.1 FIRST-ORDER SEMANTICS 


We now return to logic. In Section 1.5, we proved that propositional logic is both sound 
and complete (Theorems 1.5.9 and 1.5.15). We now do the same for first-order logic. 
We have an added complication in that this logic involves formulas with variables. 
Sometimes the variables are all bound resulting in a sentence (Definition 2.2.14), but 
other times the formula will have free occurrences. We need additional machinery to 
handle this. Throughout this chapter, let A be a first-order alphabet and S its set of 
theory symbols. We start with the fundamental definition (compare Definition 4.1.1). 


DEFINITION 7.1.1


The pair $\mathfrak{A} = (A, \mathfrak{a})$ is an S-structure if $A \neq \varnothing$ and $\mathfrak{a}$ is a function with domain
S such that


• $\mathfrak{a}(c)$ is an element of $A$ for every constant $c \in S$,
• $\mathfrak{a}(R)$ is an $n$-ary relation on $A$ for every $n$-ary relation symbol $R \in S$,
• $\mathfrak{a}(f)$ is an $n$-ary function on $A$ for every $n$-ary function symbol $f \in S$.
333 


A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’ Leary. 
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc. 


334 Chapter 7 MODELS 


The set $A$ is the domain of $\mathfrak{A}$ and is denoted by $\mathrm{dom}(\mathfrak{A})$. The domain of the
function $\mathfrak{a}$ is the signature of the structure. If $S = \{s_0, s_1, s_2, \ldots\}$, we often
write the structure as

$$(A, \mathfrak{a}(s_0), \mathfrak{a}(s_1), \mathfrak{a}(s_2), \ldots)$$

or

$$(A, s_0^{\mathfrak{A}}, s_1^{\mathfrak{A}}, s_2^{\mathfrak{A}}, \ldots).$$

The font used to identify a structure and its function is the traditional one. It is
called Fraktur and can be found in the appendix.


The purpose of the function $\mathfrak{a}$ is to associate a symbol with a particular object in
the domain of the structure. For this reason, if $s$ is an element of the signature of the
structure, the object $\mathfrak{a}(s)$ is called the meaning of $s$ and the symbol $s$ is the name of
$\mathfrak{a}(s)$.
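EXAMPLE 7.1.2


We are familiar with the constant, function, and relation symbols of NT
(Example 2.1.4). We define an NT-structure $\mathfrak{A}$ with domain $\omega$. To do this, specify the
function $\mathfrak{a}$:

$$\mathfrak{a}(0) = \varnothing,$$
$$\mathfrak{a}(1) = \{\varnothing\},$$
$$\mathfrak{a}(+) = \{((m, n), m + n) : m, n \in \omega\},$$
$$\mathfrak{a}(\cdot) = \{((m, n), m \cdot n) : m, n \in \omega\}.$$

Notice that $\varnothing$ is the meaning of 0 and 1 is the name of $\{\varnothing\}$. Also, observe
that $\mathfrak{a}(+)$ and $\mathfrak{a}(\cdot)$ are the addition and multiplication functions on $\omega$
(Definitions 5.2.15 and 5.2.18). These operations are usually represented by $+$ and $\cdot$,
but these symbols already appear in NT. Therefore, for the structure $\mathfrak{A}$,

$$0^{\mathfrak{A}} = \mathfrak{a}(0),$$
$$1^{\mathfrak{A}} = \mathfrak{a}(1),$$
$$+^{\mathfrak{A}} = \mathfrak{a}(+),$$
$$\cdot^{\mathfrak{A}} = \mathfrak{a}(\cdot),$$

so $\mathfrak{A} = (\omega, \mathfrak{a})$ is an NT-structure with signature $\{0, 1, +, \cdot\}$ that can be written as
$$(\omega, \varnothing, \{\varnothing\}, \{((m, n), m + n) : m, n \in \omega\}, \{((m, n), m \cdot n) : m, n \in \omega\})$$

or, more compactly,
$$(\omega, 0^{\mathfrak{A}}, 1^{\mathfrak{A}}, +^{\mathfrak{A}}, \cdot^{\mathfrak{A}}).$$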


Section 7.1. FIRST-ORDER SEMANTICS 335 


EXAMPLE 7.1.3


When the symbols of a structure's signature are not needed to represent the meanings
of the symbols, the notation of Example 7.1.2 is not needed. This is common
for GR-structures. For example, if $\mathfrak{b}(e) = 0$ and $\mathfrak{b}(\circ) = +$, then

$$(\mathbb{Z}, \mathfrak{b}) = (\mathbb{Z}, 0, +)$$

is a GR-structure with signature $\{e, \circ\}$ (Example 2.1.5), and if $\mathfrak{c}(e) = 1$ and
$\mathfrak{c}(\circ) = \cdot$, then

$$(\mathbb{R} \setminus \{0\}, \mathfrak{c}) = (\mathbb{R} \setminus \{0\}, 1, \cdot)$$

is also a GR-structure with signature $\{e, \circ\}$.


Satisfaction 


The purpose of a structure is to serve as a universe for a given language. Recall that 
the terms of a language represent objects and the formulas of a language describe the 
properties of objects (Figure 2.3). With this in mind, we now make sense of the terms 
and formulas within structures. The first step in doing this is to define how to give 
meaning to terms. This is done so that the terms represent elements of the domain of 
a structure. The second step is to develop a method by which it can be determined 
which formulas hold true in a structure and which do not. We begin by defining the 
interpretation of terms. 


DEFINITION 7.1.4


Let $\mathfrak{A} = (A, \mathfrak{a})$ be an S-structure. Define an S-interpretation of $\mathfrak{A}$ to be a function
$I : \mathsf{TERMS}(\mathsf{A}) \to A$ that has the following properties:


• If $x$ is a variable symbol, then $I(x) \in A$, thus assigning a value to $x$.
• If $c$ is a constant symbol, $I(c) = \mathfrak{a}(c)$.
• If $f$ is a function symbol and $t_0, t_1, \ldots, t_{n-1}$ are S-terms,

$$I(f(t_0, t_1, \ldots, t_{n-1})) = \mathfrak{a}(f)(I(t_0), I(t_1), \ldots, I(t_{n-1})).$$


Definition 7.1.4 is an example of a definition by induction on terms. This means 
that first the definition was made for variable and constant symbols, and then assuming 
that it was made for terms in general, the definition is made for functions applied to 
terms. Induction on terms simply follows Definition 2.1.7. A proof done by induction 
on terms is one that uses the same process to prove a result about terms. 


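EXAMPLE 7.1.5


Let $(\omega, 0^{\mathfrak{A}}, 1^{\mathfrak{A}}, +^{\mathfrak{A}}, \cdot^{\mathfrak{A}})$ be the NT-structure from Example 7.1.2. Let $I$ be an
NT-interpretation such that

$$I(x) = 5,$$
$$I(y) = 7,$$
$$I(0) = 0^{\mathfrak{A}},$$
$$I(1) = 1^{\mathfrak{A}}.$$

Then, $I(x + y) = 12$ is the interpretation of $x + y$ because
$$I(x + y) = \mathfrak{a}(+)(I(x), I(y)) = \mathfrak{a}(+)(5, 7) = 5 +^{\mathfrak{A}} 7,$$
and $I(0 \cdot 1) = 0^{\mathfrak{A}}$ because

$$I(0 \cdot 1) = \mathfrak{a}(\cdot)(I(0), I(1)) = \mathfrak{a}(\cdot)(0^{\mathfrak{A}}, 1^{\mathfrak{A}}) = 0^{\mathfrak{A}}.$$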
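
The recursion of Definition 7.1.4 translates directly into a short evaluator. The sketch below makes several assumptions that are not part of the text: Python integers stand in for the elements of the domain, a term is either a string (a variable or constant symbol) or a tuple whose first entry is a function symbol, and the names `interpret`, `meaning`, and `assignment` are hypothetical.

```python
def interpret(term, meaning: dict, assignment: dict):
    """Evaluate a term by recursion on its shape, in the manner of Definition 7.1.4."""
    if isinstance(term, str):
        if term in assignment:        # variable symbol: use the assigned value
            return assignment[term]
        return meaning[term]          # constant symbol: use its meaning
    symbol, *arguments = term         # function symbol applied to terms
    values = [interpret(t, meaning, assignment) for t in arguments]
    return meaning[symbol](*values)


# Stand-in for the NT-structure: integers in place of the von Neumann naturals.
meaning = {
    "0": 0,
    "1": 1,
    "+": lambda m, n: m + n,
    "*": lambda m, n: m * n,
}
assignment = {"x": 5, "y": 7}

print(interpret(("+", "x", "y"), meaning, assignment))   # the term x + y
print(interpret(("*", "0", "1"), meaning, assignment))   # the term 0 . 1
```

Separating `meaning` from `assignment` mirrors the split in the definition: the structure fixes what constants and function symbols denote, while the interpretation is free to choose values for the variables.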


We are now ready for the main definition. It describes what it means for a formula 
to be interpreted as true. The definition, which is foundational to model theory, is 
generally attributed to Alfred Tarski. This contribution is found in two papers, “Der
Wahrheitsbegriff in den formalisierten Sprachen” (1935) and “Arithmetical extensions
of relational systems” with Robert Vaught (1957).


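DEFINITION 7.1.6


Let $\mathfrak{A} = (A, \mathfrak{a})$ be an S-structure and $I$ be an S-interpretation of $\mathfrak{A}$. Assume that
$p$ and $q$ are S-formulas and $t_0, t_1, \ldots, t_{n-1}$ are S-terms. Define $\vDash$ as follows:


• $\mathfrak{A} \vDash t_0 = t_1\ [I] \Leftrightarrow I(t_0) = I(t_1)$.

• $\mathfrak{A} \vDash R(t_0, t_1, \ldots, t_{n-1})\ [I] \Leftrightarrow (I(t_0), I(t_1), \ldots, I(t_{n-1})) \in \mathfrak{a}(R)$.

• $\mathfrak{A} \vDash \neg p\ [I] \Leftrightarrow$ (not $\mathfrak{A} \vDash p\ [I]$).

• $\mathfrak{A} \vDash p \to q\ [I] \Leftrightarrow$ (if $\mathfrak{A} \vDash p\ [I]$ then $\mathfrak{A} \vDash q\ [I]$).

• $\mathfrak{A} \vDash \exists x p\ [I] \Leftrightarrow$ ($\mathfrak{A} \vDash p\ [I_x^a]$ for some $a \in A$), where for every $u \in A$, the
function $I_x^u$ is the S-interpretation of $\mathfrak{A}$ such that if $y$ is a variable symbol,

$$I_x^u(y) = \begin{cases} u & \text{if } y = x, \\ I(y) & \text{if } y \neq x, \end{cases}$$

and if $c$ is a constant symbol, $I_x^u(c) = I(c)$.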


Definition 7.1.6 is an example of a definition by induction on formulas. This means
that first the definition was made for the basic formulas involving equality and relation
symbols, and then assuming that it was made for formulas in general, the definition
is made for formulas written using connectives or quantifiers. Induction on formulas
simply follows Definition 2.1.9. A proof done by induction on formulas is one that uses
the same process to prove a result about formulas.

Definition 7.1.6 can be extended to the other connectives and the universal quantifier
using the next theorem.




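THEOREM 7.1.7


Let $\mathfrak{A} = (A, \mathfrak{a})$ be an S-structure and $I$ be an S-interpretation of $\mathfrak{A}$. Assume that
$p$ and $q$ are S-formulas.


• $\mathfrak{A} \vDash p \land q\ [I] \Leftrightarrow$ ($\mathfrak{A} \vDash p\ [I]$ and $\mathfrak{A} \vDash q\ [I]$).

• $\mathfrak{A} \vDash p \lor q\ [I] \Leftrightarrow$ ($\mathfrak{A} \vDash p\ [I]$ or $\mathfrak{A} \vDash q\ [I]$).

• $\mathfrak{A} \vDash p \leftrightarrow q\ [I] \Leftrightarrow$ ($\mathfrak{A} \vDash p\ [I]$ if and only if $\mathfrak{A} \vDash q\ [I]$).

• $\mathfrak{A} \vDash \forall x p\ [I] \Leftrightarrow$ ($\mathfrak{A} \vDash p\ [I_x^a]$ for all $a \in A$).


PROOF
Since $p \lor q \Leftrightarrow \neg p \to q$ (page 59), we have the following:

$\mathfrak{A} \vDash p \lor q\ [I] \Leftrightarrow \mathfrak{A} \vDash \neg p \to q\ [I]$
$\Leftrightarrow$ if $\mathfrak{A} \vDash \neg p\ [I]$, then $\mathfrak{A} \vDash q\ [I]$
$\Leftrightarrow$ if not $\mathfrak{A} \vDash p\ [I]$, then $\mathfrak{A} \vDash q\ [I]$
$\Leftrightarrow$ $\mathfrak{A} \vDash p\ [I]$ or $\mathfrak{A} \vDash q\ [I]$.

Since $\forall x p \Leftrightarrow \neg\exists x \neg p$ by QN, we have the following:

$\mathfrak{A} \vDash \forall x p\ [I] \Leftrightarrow \mathfrak{A} \vDash \neg\exists x \neg p\ [I]$
$\Leftrightarrow$ not $\mathfrak{A} \vDash \exists x \neg p\ [I]$
$\Leftrightarrow$ not ($\mathfrak{A} \vDash \neg p\ [I_x^a]$ for some $a \in A$)
$\Leftrightarrow$ not (not $\mathfrak{A} \vDash p\ [I_x^a]$ for some $a \in A$)
$\Leftrightarrow$ $\mathfrak{A} \vDash p\ [I_x^a]$ for all $a \in A$.

The remaining parts are left to Exercise 1. ∎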


EXAMPLE 7.1.8


Let the ST-structure $\mathfrak{A}$ be defined as $(\omega, \in)$. Since ST has only relation symbols,
$\mathfrak{A}$ is called a purely relational structure. Although we usually make a notational
distinction between a name and its meaning, we do not do this with the $\in$ symbol.
Likewise, the equality symbol $=$ can be used in a formula and during any
interpretation of the formula. To see how this works, define

$$p := \forall x \forall y \forall z (x \in y \land y \in z \to x \in z).$$

We show that $\mathfrak{A} \vDash p\ [I]$ for any ST-interpretation $I$. To determine what needs
to be done to accomplish this, we work backwards using Definition 7.1.6 and
Theorem 7.1.7. Let $I$ be an ST-interpretation and assume that

$$\mathfrak{A} \vDash \forall x \forall y \forall z (x \in y \land y \in z \to x \in z)\ [I].$$

This implies that

$$\mathfrak{A} \vDash x \in y \land y \in z \to x \in z\ [((I_x^a)_y^b)_z^c] \text{ for all } a, b, c \in \omega.$$

How this formula is interpreted is still based on the order of connectives found
in Definition 1.1.5. Therefore, since the conjunction has precedence, we apply
Definition 7.1.6 to find that

for all $a, b, c \in \omega$, if $\mathfrak{A} \vDash x \in y \land y \in z\ [((I_x^a)_y^b)_z^c]$,
then $\mathfrak{A} \vDash x \in z\ [((I_x^a)_y^b)_z^c]$.

Hence, by Theorem 7.1.7,

for all $a, b, c \in \omega$, if $\mathfrak{A} \vDash x \in y\ [((I_x^a)_y^b)_z^c]$ and $\mathfrak{A} \vDash y \in z\ [((I_x^a)_y^b)_z^c]$,
then $\mathfrak{A} \vDash x \in z\ [((I_x^a)_y^b)_z^c]$.

We now apply the interpretation using Definition 7.1.4 to find that

for all $a, b, c \in \omega$, if $((I_x^a)_y^b)_z^c(x) \in ((I_x^a)_y^b)_z^c(y)$ and $((I_x^a)_y^b)_z^c(y) \in ((I_x^a)_y^b)_z^c(z)$,
then $((I_x^a)_y^b)_z^c(x) \in ((I_x^a)_y^b)_z^c(z)$.

Therefore,

for all $a, b, c \in \omega$, if $a \in b$ and $b \in c$, then $a \in c$,

which follows by Theorem 5.2.8. This means that we can work back through the steps
to prove that $\mathfrak{A} \vDash p\ [I]$.


Typically, formulas cannot be understood as true or false on their own. They have to 
be examined against a given universe. The structure is that universe, and the examining 
is done by the interpretation. For this reason, the structure-interpretation pair forms the 
basis for our work in first-order logic, so we give it a name. 

DEFINITION 7.1.9


Let $I$ be an S-interpretation of the S-structure $\mathfrak{A}$. Let $p$ be an S-formula. The pair
$(\mathfrak{A}, I)$ is called an S-model. Additionally, if $\mathfrak{A} \vDash p\ [I]$, then $(\mathfrak{A}, I)$ is a model
of $p$ and satisfies $p$. If $(\mathfrak{A}, I)$ is not a model of $p$, write $\mathfrak{A} \nvDash p\ [I]$.


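EXAMPLE 7.1.10


Let $\mathfrak{A}$ be the NT-structure with domain $\omega$ and signature $\{0, 1, +, \cdot\}$ from
Example 7.1.2. Let $I$ be an NT-interpretation of $\mathfrak{A}$ such that

$$I(x) = 0^{\mathfrak{A}}, \quad I(x_i) = i +^{\mathfrak{A}} 1^{\mathfrak{A}}, \quad \text{and} \quad I(1) = 1^{\mathfrak{A}}.$$

• $(\mathfrak{A}, I)$ is a model of $x_4 + x_7 = x_7 + x_4$ because by Theorem 5.2.17,
$$\begin{aligned}
I(x_4 + x_7) &= I(x_4) +^{\mathfrak{A}} I(x_7) \\
&= 5 +^{\mathfrak{A}} 8 \\
&= 8 +^{\mathfrak{A}} 5 \\
&= I(x_7) +^{\mathfrak{A}} I(x_4) \\
&= I(x_7 + x_4).
\end{aligned}$$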



e (2, J) is a model of Vx(x = x, > x+1=x>+4+1). To see this, taken € w 
and assume that 2% F x = x [J"’]. This implies that n = I(x). Hence, 


n+™% om — I(x) ea 
n+™ T(1) = I(x) +™ 1), 
T(x) +% 1) = (xq) +™ (0), 
Ix +1) = 2. + 1). 


Therefore, 2 F x + 1 = x, +1 [2]. We conclude that for all n € @, 
if WE x = x LY], then WE x+1=x.+1 WY), 
so for alln € a, 
WEx=x,>xt+l=x,+1[N. 
This implies that 


WE VxX(x = xX, > x+1l=x,4+1) [7]. 


Definition 7.1.9 can be generalized to sets of formulas. 


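DEFINITION 7.1.11


If $(\mathfrak{A}, I)$ is an S-model and $\mathcal{F}$ is a set of S-formulas,
$$\mathfrak{A} \vDash \mathcal{F}\ [I] \text{ if and only if } \mathfrak{A} \vDash p\ [I] \text{ for all } p \in \mathcal{F}.$$
If $\mathfrak{A} \vDash \mathcal{F}\ [I]$, then $(\mathfrak{A}, I)$ is a model of $\mathcal{F}$ and satisfies $\mathcal{F}$.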
For example, using the NT-model of Example 7.1.10, we see that 
WE {xy tx = X7 +24, VX(X =X, 7 X4+1=2x,4+)} I, (7.1) 


so (2, I) is a model of {x4 +x7 = x7+x4,Vx(x =X, > x+1=x,+))}. Hereisa 
more involved example. 


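EXAMPLE 7.1.12


Let $\mathfrak{B}$ be the GR-structure $(\mathbb{Z}, \mathfrak{b})$ so that $\mathfrak{b}$ is defined by $\mathfrak{b}(e) = 0$ and $\mathfrak{b}(\circ) = +$.
Let the interpretation $J$ have the property that

$$J(x_i) = \begin{cases} i/2 & \text{if } i \text{ is even,} \\ (i+1)/2 & \text{if } i \text{ is odd.} \end{cases}$$

• $\mathfrak{B} \vDash \exists x (x_7 \circ x = e)\ [J]$. This is because

$$\mathfrak{B} \vDash x_7 \circ x = e\ [J_x^{-4}],$$

and this holds since $4 + (-4) = 0$.

• $\mathfrak{B} \vDash \forall x \exists y (x \circ y = e)\ [J]$. To prove this, take $n \in \mathbb{Z}$. Then,
$$\mathfrak{B} \vDash \exists y (x \circ y = e)\ [J_x^n]$$

because
$$\mathfrak{B} \vDash x \circ y = e\ [(J_x^n)_y^{-n}].$$

This last satisfaction holds because $-n \in \mathbb{Z}$ and $n + (-n) = 0$.

• The previous two satisfactions demonstrate that

$$\mathfrak{B} \vDash \{\exists x (x_7 \circ x = e), \forall x \exists y (x \circ y = e)\}\ [J]. \qquad (7.2)$$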
The previous work with models leads to the following definition. 


DEFINITION 7.1.13


The S-formula $p$ is S-satisfiable (denoted by $\mathrm{Sat}_S\, p$) if there is an S-structure $\mathfrak{A}$
and S-interpretation $I$ such that $(\mathfrak{A}, I)$ is a model for $p$. The set of S-formulas $\mathcal{F}$
is S-satisfiable (denoted by $\mathrm{Sat}_S\, \mathcal{F}$) if there is a model for $\mathcal{F}$.


By (7.1), we have that 
Satyr {X4 + x7 = x7 + x4, Vx[(x + x5) + xg =X + (x5 + XI}, 


and by (7.2), we have that $\mathrm{Sat}_{\mathrm{GR}}\, \{\exists x (x_7 \circ x = e), \forall x \exists y (x \circ y = e)\}$.


Groups 


As noted in Example 2.1.5, the GR symbols are intended for the study of a set with a 
single binary operation defined on it. The basic example is Z with addition. Although 
other properties will be added later, the language developed for this is designed to 
handle just the basic properties of this pair. Namely, the operation should be associative, 
the set should have an identity, and every element should have an inverse. This means 
that the axioms that will define this theory will be GR-sentences, and we need only 
three. 


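AXIOMS 7.1.14 [Group]
• G1. $\forall x \forall y \forall z [x \circ (y \circ z) = (x \circ y) \circ z]$.

• G2. $\forall x (e \circ x = x \land x \circ e = x)$.

• G3. $\forall x \exists y (x \circ y = e \land y \circ x = e)$.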




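EXAMPLE 7.1.15


Define a GR-structure $\mathfrak{G} = (G, \mathfrak{g})$ by letting the domain $G = \{0\}$, $\mathfrak{g}(e) = 0$, and
$\mathfrak{g}(\circ) = +$. The binary operation $+$ is addition of integers and $0 \in \mathbb{Z}$ (Section 5.3).
Let $I$ be an interpretation of $\mathfrak{G}$. We show that $\mathfrak{G} \vDash \{G1, G2, G3\}\ [I]$.


• Let $a, b, c \in G$. Since 0 is the only element of $G$, observe that by Definition 7.1.4,
$$\begin{aligned}
((I_x^a)_y^b)_z^c(x \circ (y \circ z)) &= ((I_x^a)_y^b)_z^c(x) + ((I_x^a)_y^b)_z^c(y \circ z) && (7.3) \\
&= ((I_x^a)_y^b)_z^c(x) + [((I_x^a)_y^b)_z^c(y) + ((I_x^a)_y^b)_z^c(z)] && (7.4) \\
&= 0 + (0 + 0) && (7.5) \\
&= (0 + 0) + 0 && (7.6) \\
&= [((I_x^a)_y^b)_z^c(x) + ((I_x^a)_y^b)_z^c(y)] + ((I_x^a)_y^b)_z^c(z) && (7.7) \\
&= ((I_x^a)_y^b)_z^c(x \circ y) + ((I_x^a)_y^b)_z^c(z) && (7.8) \\
&= ((I_x^a)_y^b)_z^c([x \circ y] \circ z). && (7.9)
\end{aligned}$$
We see that (7.6) follows from (7.5) by Theorem 5.3.5. Therefore, by
Theorem 7.1.7,

$$\mathfrak{G} \vDash x \circ (y \circ z) = (x \circ y) \circ z\ [((I_x^a)_y^b)_z^c] \text{ for all } a, b, c \in G,$$
which implies that
$$\mathfrak{G} \vDash \forall z [x \circ (y \circ z) = (x \circ y) \circ z]\ [(I_x^a)_y^b] \text{ for all } a, b \in G.$$
Therefore,
$$\mathfrak{G} \vDash \forall y \forall z [x \circ (y \circ z) = (x \circ y) \circ z]\ [I_x^a] \text{ for all } a \in G,$$
so
$$\mathfrak{G} \vDash \forall x \forall y \forall z [x \circ (y \circ z) = (x \circ y) \circ z]\ [I].$$

• Let $a \in G$. Again, by Definition 7.1.4,
$$I_x^a(e \circ x) = I_x^a(e) + I_x^a(x) = 0 + 0 = 0 = I_x^a(x),$$
so by Definition 7.1.6,
$$\mathfrak{G} \vDash e \circ x = x\ [I_x^a].$$
Also,
$$I_x^a(x \circ e) = I_x^a(x) + I_x^a(e) = 0 + 0 = 0 = I_x^a(x),$$
so
$$\mathfrak{G} \vDash x \circ e = x\ [I_x^a].$$
Therefore, by Theorem 7.1.7,
$$\mathfrak{G} \vDash e \circ x = x \land x \circ e = x\ [I_x^a].$$
Since $a$ was arbitrarily chosen,
$$\mathfrak{G} \vDash \forall x (e \circ x = x \land x \circ e = x)\ [I].$$

• Take $a \in G$. Because

$$(I_x^a)_y^0(x \circ y) = (I_x^a)_y^0(x) + (I_x^a)_y^0(y) = 0 + 0 = 0 = (I_x^a)_y^0(e),$$

we have
$$\mathfrak{G} \vDash x \circ y = e\ [(I_x^a)_y^0].$$
Similarly,
$$\mathfrak{G} \vDash y \circ x = e\ [(I_x^a)_y^0],$$
so
$$\mathfrak{G} \vDash x \circ y = e \land y \circ x = e\ [(I_x^a)_y^0].$$
Therefore, since $0 \in G$,

there exists $b \in G$ such that $\mathfrak{G} \vDash x \circ y = e \land y \circ x = e\ [(I_x^a)_y^b]$,

and since $a$ was arbitrarily chosen, we have that

for all $a \in G$, there exists $b \in G$ such that
$\mathfrak{G} \vDash x \circ y = e \land y \circ x = e\ [(I_x^a)_y^b]$.

Then, by Definition 7.1.6, we conclude that

for all $a \in G$, $\mathfrak{G} \vDash \exists y (x \circ y = e \land y \circ x = e)\ [I_x^a]$,

so by Theorem 7.1.7,
$$\mathfrak{G} \vDash \forall x \exists y (x \circ y = e \land y \circ x = e)\ [I].$$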
Based on Example 7.1.15, we conclude that {G1, G2, G3} is GR-satisfiable (Defini- 


tion 7.1.13). That is, there exists a model in which G1, G2, and G3 are interpreted as 
true. The name of the model was first used by Evariste Galois in the early 1830s. 


HM DEFINITION 7.1.16 
A GR-structure that models the group axioms is called a group.
A group with a commutative binary operation, one that satisfies
∀x∀y(x ∘ y = y ∘ x),
is known as an abelian group. It is named after the Norwegian mathematician Niels
Abel. Using the GR-structure 𝔊 and interpretation I of Example 7.1.15, the set of
GR-sentences {G1, G2, G3, ∀x∀y(x ∘ y = y ∘ x)} is shown to be GR-satisfiable.
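
For a finite domain, the three group axioms and the commutativity sentence can be checked by exhaustive search. The following Python sketch does this for the one-element structure ({0}, 0, +); the function names `is_group` and `is_abelian` are illustrative choices and are not notation from the text.

```python
from itertools import product

def is_group(domain, identity, op):
    """Brute-force check of G1, G2, G3 on a finite structure (domain, identity, op)."""
    elems = list(domain)
    # op must be a binary operation on the domain (closure).
    closed = all(op(a, b) in domain for a, b in product(elems, repeat=2))
    g1 = all(op(a, op(b, c)) == op(op(a, b), c)
             for a, b, c in product(elems, repeat=3))
    g2 = all(op(identity, a) == a and op(a, identity) == a for a in elems)
    g3 = all(any(op(a, b) == identity and op(b, a) == identity for b in elems)
             for a in elems)
    return closed and g1 and g2 and g3

def is_abelian(domain, op):
    """Check the sentence 'for all x, y: x o y = y o x' on a finite domain."""
    return all(op(a, b) == op(b, a) for a, b in product(list(domain), repeat=2))

def add(a, b):
    return a + b

print(is_group({0}, 0, add), is_abelian({0}, add))
```

The search only works because the domain is finite; it illustrates what the axioms demand but does not replace the proof in Example 7.1.15.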




M@ EXAMPLE 7.1.17 
Using the definitions of Examples 4.2.6 and 4.2.9 for Z, define
Z_n = {[a]_n : a ∈ Z},
and on this set, specify + by
[a]_n + [b]_n = [a + b]_n.

As we have seen, the meaning of the symbol + is determined by context. The
+ on the left is the new definition, but the + on the right is standard addition
(Definition 5.3.4). With this definition, a generalization of Example 4.4.30 shows
that + is a binary operation. We check that 𝔊 = (Z_n, [0]_n, +) is a group. Let I be
a GR-interpretation of 𝔊 such that I(e) = [0]_n. We have three axioms to check.


• Let [a]_n, [b]_n, [c]_n ∈ Z_n, where a, b, c ∈ Z. Write K for the interpretation
((I_x^{[a]_n})_y^{[b]_n})_z^{[c]_n}. Observe that by Definition 7.1.4,

\[
\begin{aligned}
K(x \circ (y \circ z)) &= K(x) + K(y \circ z) && (7.10)\\
&= K(x) + [K(y) + K(z)] && (7.11)\\
&= [a]_n + ([b]_n + [c]_n)\\
&= [a]_n + [b + c]_n\\
&= [a + (b + c)]_n && (7.12)\\
&= [a + b + c]_n && (7.13)\\
&= [a + b]_n + [c]_n\\
&= ([a]_n + [b]_n) + [c]_n\\
&= [K(x) + K(y)] + K(z) && (7.14)\\
&= K(x \circ y) + K(z) && (7.15)\\
&= K((x \circ y) \circ z). && (7.16)
\end{aligned}
\]

Therefore,
K(x ∘ (y ∘ z)) = K((x ∘ y) ∘ z) for all [a]_n, [b]_n, [c]_n ∈ Z_n,
which implies that
𝔊 ⊨ ∀x∀y∀z[x ∘ (y ∘ z) = (x ∘ y) ∘ z] [I].


Notice that (7.13) follows from (7.12) by Theorem 5.3.5. More importantly,
notice that (7.10), (7.11), (7.14), (7.15), and (7.16) mimic (7.3),
(7.4), (7.7), (7.8), and (7.9) of Example 7.1.15. The other equalities are
specific to this example. We conclude that in order to prove that 𝔊 is a
model of G1, we only need to show the work specific to the structure 𝔊.
We use this to prove 𝔊 ⊨ {G2, G3} [I].

[Figure 7.1 The Klein-4 group.]


• Let a ∈ Z. Then,
[0]_n + [a]_n = [0 + a]_n = [a]_n
and
[a]_n + [0]_n = [a + 0]_n = [a]_n.
Therefore, 𝔊 ⊨ G2 [I].

• Let b ∈ Z. Then,
[b]_n + [−b]_n = [b + (−b)]_n = [0]_n
and
[−b]_n + [b]_n = [(−b) + b]_n = [0]_n.
Hence, 𝔊 ⊨ G3 [I].

Therefore, 𝔊 is a group. Moreover, because for all a, b ∈ Z,
[a]_n + [b]_n = [a + b]_n = [b + a]_n = [b]_n + [a]_n,
𝔊 is an abelian group.
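
The definition of + on congruence classes only makes sense if the sum does not depend on the chosen representatives. The Python sketch below checks this, along with the group axioms and commutativity, for one modulus. The choice n = 6, the encoding of [a]_n by the remainder a mod n, and the names `cls` and `add_classes` are assumptions made for illustration.

```python
def cls(a, n):
    """Canonical representative of the congruence class [a]_n."""
    return a % n

def add_classes(x, y, n):
    """[x]_n + [y]_n = [x + y]_n, computed on any representatives x, y."""
    return cls(x + y, n)

n = 6
Zn = range(n)

# Well-definedness: shifting representatives by multiples of n leaves the sum unchanged.
well_defined = all(
    add_classes(a, b, n) == add_classes(a + n * s, b + n * t, n)
    for a in Zn for b in Zn for s in (-2, 0, 3) for t in (-1, 0, 4)
)
associative = all(
    add_classes(a, add_classes(b, c, n), n) == add_classes(add_classes(a, b, n), c, n)
    for a in Zn for b in Zn for c in Zn
)
identity = all(add_classes(0, a, n) == a == add_classes(a, 0, n) for a in Zn)
inverses = all(add_classes(a, cls(-a, n), n) == 0 for a in Zn)
commutative = all(add_classes(a, b, n) == add_classes(b, a, n) for a in Zn for b in Zn)

print(well_defined, associative, identity, inverses, commutative)
```

Only a few shifts s and t are sampled, so the well-definedness check is evidence rather than proof; the general argument is the one referenced from Example 4.4.30.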


M@ EXAMPLE 7.1.18 


The structures (ω, ∅, +), (Z, 1, ·), and (R, 1, ·) are not groups, but (Z, 0, +),
(Q, 0, +), (R, 0, +), (Q \ {0}, 1, ·), and (R \ {0}, 1, ·) are. These are examples
of infinite groups. When the group's set is finite, we use the term order to refer
to its cardinality. The group 𝔊 = ({e}, e, ∗), where ∗ = {((e, e), e)}, is the only
group of order 1 (Example 7.1.15). This means that any other group with one
element, such as ({0}, 0, +) or ({1}, 1, ·), has the same structure as 𝔊. We say
that these three groups are isomorphic. Any two groups of order 2 will be
isomorphic, and any two groups of order 3 will also be isomorphic. There are essentially
two groups of order 4, one being the Klein-4 group (Figure 7.1) and the other
being the group of Example 7.1.17 when n = 4.
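
One way to see that the two groups of order 4 really differ is to compare the orders of their elements. The Python sketch below does this, realizing the Klein-4 group as pairs of bits under componentwise addition modulo 2. That realization is a standard one and is an assumption here, since the table of Figure 7.1 is not reproduced.

```python
from itertools import product

def element_order(a, identity, op):
    """Smallest k >= 1 such that a combined with itself k times gives the identity."""
    k, x = 1, a
    while x != identity:
        x = op(x, a)
        k += 1
    return k

def z4_op(a, b):
    return (a + b) % 4

def klein_op(a, b):
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

z4 = list(range(4))
klein = list(product((0, 1), repeat=2))

z4_orders = sorted(element_order(a, 0, z4_op) for a in z4)
klein_orders = sorted(element_order(a, (0, 0), klein_op) for a in klein)

print(z4_orders)     # [1, 2, 4, 4]
print(klein_orders)  # [1, 2, 2, 2]
```

The group of Example 7.1.17 with n = 4 has elements of order 4, while every non-identity element of the Klein-4 group has order 2, so no structure-preserving bijection between them can exist.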




M@ EXAMPLE 7.1.19 


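All of the examples of groups given so far have been abelian. Here is one that is
not. For all n ∈ Z⁺, define M_n(R) as the set of n × n matrices with real entries.
In other words, each matrix has n rows, n columns, and looks like

\[
\begin{bmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{bmatrix}
\]

where a_{i,j} ∈ R for i = 1, ..., n and j = 1, ..., n. As an example,

\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \in M_3(\mathbb{R}).
\]

Define matrix multiplication for 2 × 2 matrices by

\[
\begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}
\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}
=
\begin{bmatrix} a_{1,1}b_{1,1} + a_{1,2}b_{2,1} & a_{1,1}b_{1,2} + a_{1,2}b_{2,2} \\ a_{2,1}b_{1,1} + a_{2,2}b_{2,1} & a_{2,1}b_{1,2} + a_{2,2}b_{2,2} \end{bmatrix}.
\]

For example,

\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\begin{bmatrix} 0 & -1 \\ 2 & 1 \end{bmatrix}
=
\begin{bmatrix} 4 & 1 \\ 8 & 1 \end{bmatrix}. \tag{7.17}
\]

This multiplication is not commutative because

\[
\begin{bmatrix} 0 & -1 \\ 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
=
\begin{bmatrix} -3 & -4 \\ 5 & 8 \end{bmatrix}, \tag{7.18}
\]

but it is associative [Exercise 12(a)]. The identity matrix for M_2(R) is

\[
I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]

Notice that I_2 is the multiplicative identity. If A ∈ M_n(R), then A is invertible
if there exists B ∈ M_n(R) such that AB = BA = I_n. For n = 2,
[matrix lost in extraction] is invertible, but [matrix lost in extraction]
is not. All of this can be generalized to any n × n matrix. Finally, define

M_n^*(R) = {A ∈ M_n(R) : A is invertible}.

Let GL(n, R) denote the group (M_n^*(R), I_n, ·). This is called the general linear
group of degree n.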
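
The failure of commutativity can be confirmed by direct computation. The Python sketch below multiplies two 2 × 2 matrices in both orders. The matrices A and B, the helper names, and the use of the determinant test for invertibility (a standard fact for 2 × 2 matrices that is not proved in this section) are illustrative assumptions.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices given as nested tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det2(A):
    """Determinant of a 2x2 matrix; nonzero exactly when the matrix is invertible."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = ((1, 2), (3, 4))
B = ((0, -1), (2, 1))

print(matmul2(A, B))     # ((4, 1), (8, 1))
print(matmul2(B, A))     # ((-3, -4), (5, 8))
print(det2(A), det2(B))  # -2 2
```

Both determinants are nonzero, so A and B lie in the domain of GL(2, R), and the two different products show that this group is not abelian.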




Consequence 


We now generalize the notion of logical implication (Definition 1.2.2) to the theory of 
models. 


@ DEFINITION 7.1.20 


Let ℱ be a set of S-formulas. An S-formula p is an S-consequence of ℱ (written
as ℱ ⊨ p) when 𝔄 ⊨ ℱ [I] implies 𝔄 ⊨ p [I] for any S-model (𝔄, I). If
{p} ⊨ q, simply write p ⊨ q.


Definition 7.1.20 implies that if the S-formula q is not an S-consequence of ℱ, there
exists an S-structure 𝔄 with interpretation I such that 𝔄 ⊨ ℱ [I] but 𝔄 ⊭ q [I]. For
example, define

p := ∀x∀y(x ∘ y = y ∘ x).


We know that GL(n, R) is a group, but under any GR-interpretation I of GL(n, R),
GL(n, R) ⊭ p [I]


because of (7.17) and (7.18). Therefore, p is not an S-consequence of the group axioms
(7.1.14). In other words, not all groups are abelian groups.


M@ EXAMPLE 7.1.21 


Suppose that 
F = (VxVwW(xt+y=yt+x), Vxdy(x + y = O)}. 
Let (2, 1) be an NT-model of F. Let A = dom(2Q). Then, 
WE VxAy(x + y = 0) [7], 

so by Theorem 7.1.7, 

for allu € A, WE Ay(x + y = 0) [LY], 
which by Definition 7.1.6 implies that 

there exists v € A such that for allu € A, WE x+y=0 (Uy). 

Hence, for arbitrary a (UD) and particular b (ED) in A, 

IHC), TOYO) = 930). 


However, since 
WE VxVy(x+y=yt+x) [LT], 


we find that 


(LAM IDC), LYM) = LYHCADO), LDC), 


so, 
b b b b 
(4159), LOD) = 1)%(0). 
Therefore, by EG and UG, 
there exists v € A such that for allu € A, I(+)(U(v), [(u)) = (0), 


and we can reverse the steps above to find that 
WE AxVyW(yt+ x = 0) LW], 


so we conclude that 
F FE AXVW(y+x=0). 


We say that an S-sentence p is valid if ∅ ⊨ p and write ⊨ p. This means that if an
S-sentence p is valid, every S-structure is a model of p since every S-structure is a model
of the empty set using any S-interpretation (Exercise 7). For example, ∀x(x = x) and
p ∨ ¬p are valid.

We now connect the notions of consequence and satisfaction. 


M@ THEOREM 7.1.22 


Let ℱ be a set of S-formulas and p be an S-formula. Then,
ℱ ⊨ p if and only if not Sat_S ℱ ∪ {¬p}.


PROOF
The following are equivalent:

• ℱ ⊨ p.
• For every S-model (𝔄, I), if 𝔄 ⊨ ℱ [I], then 𝔄 ⊨ p [I].
• There does not exist an S-model (𝔄, I) so that 𝔄 ⊨ ℱ [I] and 𝔄 ⊭ p [I].
• There does not exist an S-model (𝔄, I) so that 𝔄 ⊨ ℱ [I] and 𝔄 ⊨ ¬p [I].
• There does not exist an S-model (𝔄, I) such that 𝔄 ⊨ ℱ ∪ {¬p} [I].
• Not Sat_S ℱ ∪ {¬p}. ∎


M@ EXAMPLE 7.1.23 


Since Zorn's lemma was proved from the axioms of ZFC (Theorem 5.1.13), as
in Example 7.1.17, we can use the work specific to the proof of Zorn's lemma to
conclude that ZFC ⊨ Zorn's lemma, so by Theorem 7.1.22, there is no ST-model
that satisfies ZFC and the negation of Zorn's lemma.




Mi EXAMPLE 7.1.24 


The S-formula p is valid if ¬p is not S-satisfiable. To see this, suppose that p is
not valid. This means that there is an S-model (𝔄, I) so that 𝔄 ⊭ p [I]. Hence,
by Definition 7.1.6, 𝔄 ⊨ ¬p [I]. Therefore, Sat_S {¬p}. That the converse is true
is Exercise 19.


Compare the next definition with Definition 1.3.1. 


M@ DEFINITION 7.1.25 


Let p and q be S-formulas. Then, p is logically equivalent to q means p ⊨ q
and q ⊨ p.


Notice Definition 7.1.25 implies that the formulas p and q are logically equivalent if and
only if ⊨ (p ↔ q). For example, by De Morgan's law, ¬(p ∧ q) is logically equivalent
to ¬p ∨ ¬q, and by QN, we conclude that ¬∀xp(x) is logically equivalent to ∃x¬p(x).


Coincidence 

Let 𝔄 = (Z, a) and 𝔅 = (Z, b) be GR-structures such that
a(e) = b(e) = 0,
a(∘) = b(∘) = +.

Let I be a GR-interpretation of 𝔄 and J be a GR-interpretation of 𝔅 such that
I(x) = J(x) = 3.


Other assignments of these functions are not identified. Consider the following deduction:

−3 + 3 = 0.
n + 3 = 0 for some n ∈ Z.
I_y^n(y) + I_y^n(x) = I_y^n(e) for some n ∈ Z.
I_y^n(y ∘ x) = I_y^n(e) for some n ∈ Z.
𝔄 ⊨ y ∘ x = e [I_y^n] for some n ∈ Z.


Therefore,
𝔄 ⊨ ∃y(y ∘ x = e) [I].


By replacing I with J and 𝔄 with 𝔅 in the deduction, we conclude that
𝔅 ⊨ ∃y(y ∘ x = e) [J].


Since I and J agree on their interpretations of e, ∘, and x, it is not surprising that
they should agree on their interpretation of any {e, ∘}-formula with x as its only free
variable. The generalization of this to terms and formulas is the next two results.




LEMMA 7.1.26 [Coincidence for Terms]


Let S and T be sets of theory symbols. Let 𝔄 = (A, a) be an S-structure and
𝔅 = (B, b) be a T-structure such that A = B. Let I be an interpretation of 𝔄
and J be an interpretation of 𝔅. If I ↾ VAR = J ↾ VAR and a(u) = b(u) for all
u ∈ S ∩ T, then I(t) = J(t) for every (S ∩ T)-term t.


PROOF
By induction on (S ∩ T)-terms.

• Let x be a variable symbol. Then, I(x) = J(x) by hypothesis.

• Let c be a constant symbol in S ∩ T. We have
I(c) = a(c) = b(c) = J(c).

• Suppose I(t_i) = J(t_i) for all (S ∩ T)-terms t_i with i = 0, 1, ..., n − 1, and let f
be an n-ary function symbol in S ∩ T. Then,

I(f(t_0, t_1, ..., t_{n−1})) = a(f)(I(t_0), I(t_1), ..., I(t_{n−1}))
= a(f)(J(t_0), J(t_1), ..., J(t_{n−1}))
= b(f)(J(t_0), J(t_1), ..., J(t_{n−1}))
= J(f(t_0, t_1, ..., t_{n−1})). ∎
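
The induction in this proof mirrors a recursive evaluator for terms. The Python sketch below evaluates a term in two structures that agree on the shared symbols 0, 1, and + but differ elsewhere. The tuple encoding of terms, the dictionary encoding of structures, and the extra symbols `*` and `exp` are all assumptions made for illustration.

```python
def interpret(term, structure, assignment):
    """Value of a term: variables come from the assignment, symbols from the structure."""
    if isinstance(term, str):
        if term in assignment:            # variable symbol
            return assignment[term]
        return structure[term]            # constant symbol
    f, *args = term                       # function symbol applied to subterms
    return structure[f](*(interpret(t, structure, assignment) for t in args))

# Two structures on the same domain (the integers) that agree on 0, 1, and +.
a = {"0": 0, "1": 1, "+": lambda m, n: m + n, "*": lambda m, n: m * n}
b = {"0": 0, "1": 1, "+": lambda m, n: m + n, "exp": lambda m, n: m ** n}

assignment = {"x": 3, "y": 5}             # shared values of the variable symbols

# The term (x + 0) + (y + 1) uses only symbols common to both structures.
t = ("+", ("+", "x", "0"), ("+", "y", "1"))

print(interpret(t, a, assignment), interpret(t, b, assignment))
```

Each recursive call corresponds to one case of the induction: variables, constants, and function applications. Because the two structures agree on every symbol occurring in t, the two values coincide.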


LEMMA 7.1.27 [Coincidence for Formulas]


Let S and T be sets of theory symbols. Let 𝔄 = (A, a) be an S-structure and
𝔅 = (B, b) be a T-structure such that A = B. Let I be an interpretation of 𝔄 and
J be an interpretation of 𝔅. If I ↾ VAR = J ↾ VAR and a(u) = b(u) for every
u ∈ S ∩ T, then 𝔄 ⊨ p [I] if and only if 𝔅 ⊨ p [J] for all (S ∩ T)-formulas p.


PROOF
By induction on (S ∩ T)-formulas.

• Let t_0 and t_1 be (S ∩ T)-terms. Then, by Lemma 7.1.26,
𝔄 ⊨ t_0 = t_1 [I] ⇔ I(t_0) = I(t_1)
⇔ J(t_0) = J(t_1)
⇔ 𝔅 ⊨ t_0 = t_1 [J].

• Let t_0, t_1, ..., t_{n−1} be (S ∩ T)-terms and R an n-ary relation symbol of S ∩ T. Then,
by Lemma 7.1.26,
𝔄 ⊨ R(t_0, t_1, ..., t_{n−1}) [I] ⇔ (I(t_0), I(t_1), ..., I(t_{n−1})) ∈ a(R)
⇔ (J(t_0), J(t_1), ..., J(t_{n−1})) ∈ a(R)
⇔ (J(t_0), J(t_1), ..., J(t_{n−1})) ∈ b(R)
⇔ 𝔅 ⊨ R(t_0, t_1, ..., t_{n−1}) [J].

Now let p and q be (S ∩ T)-formulas for which the result holds.

• 𝔄 ⊨ ¬p [I] ⇔ 𝔄 ⊭ p [I] ⇔ 𝔅 ⊭ p [J] ⇔ 𝔅 ⊨ ¬p [J].

• Assume that 𝔄 ⊨ p [I] implies 𝔄 ⊨ q [I]. Also, suppose that 𝔅 ⊨ p [J].
Then, 𝔄 ⊨ p [I] by induction, so 𝔄 ⊨ q [I]. Thus, 𝔅 ⊨ q [J] by
induction. The converse is proved similarly, so we have
𝔄 ⊨ p → q [I] ⇔ if 𝔄 ⊨ p [I] then 𝔄 ⊨ q [I]
⇔ if 𝔅 ⊨ p [J] then 𝔅 ⊨ q [J]
⇔ 𝔅 ⊨ p → q [J].

• Note that for all b ∈ A,
I_x^b ↾ VAR = J_x^b ↾ VAR
because
I_x^b(x) = b = J_x^b(x)
and if y ≠ x,
I_x^b(y) = I(y) = J(y) = J_x^b(y).
Therefore, by induction and since A = B,
𝔄 ⊨ ∃xp [I] ⇔ 𝔄 ⊨ p [I_x^a] for some a ∈ A
⇔ 𝔅 ⊨ p [J_x^a] for some a ∈ B
⇔ 𝔅 ⊨ ∃xp [J]. ∎


M@ EXAMPLE 7.1.28 


Define the sets of theory symbols S = {0, 1, +, ·, <} and T = {0, 1, +, ∗, >}. Let
𝔄 = (ω, a) be an S-structure and 𝔅 = (ω, b) be a T-structure where

a(0) = b(0) = ∅,
a(1) = b(1) = {∅},
a(+) = b(+).

Notice, for example, that under the right interpretation, 𝔄 could be a model of
∀x(x · 1 = x) since it is an S-formula, but it does not make sense for 𝔅 to be
a model of the same sentence because ∀x(x · 1 = x) is not a T-formula. Now,
let I be an S-interpretation of 𝔄 and J be a T-interpretation of 𝔅 such that they
agree on all variable symbols. Since we have the hypotheses of Lemma 7.1.27
satisfied, let us confirm the lemma. Consider the (S ∩ T)-formula ∀x(x + 0 = x).
Assume
𝔄 ⊨ ∀x(x + 0 = x) [I],
so
𝔄 ⊨ (x + 0 = x) [I_x^n] for all n ∈ ω.
This implies that
a(+)(I_x^n(x), I_x^n(0)) = I_x^n(x) for all n ∈ ω.
Since I and J agree on all variable symbols, a(0) = b(0), and a(+) = b(+),
b(+)(J_x^n(x), J_x^n(0)) = J_x^n(x) for all n ∈ ω.
Therefore,
𝔅 ⊨ (x + 0 = x) [J_x^n] for all n ∈ ω,
which gives
𝔅 ⊨ ∀x(x + 0 = x) [J].

The purpose of the coincidence lemmas (7.1.26 and 7.1.27) is to minimize the use 


of interpretation functions, especially when modeling sentences. 


@ LEMMA 7.1.29 
Let 𝔄 be an S-structure. If p is an S-sentence,
𝔄 ⊨ p [I] if and only if 𝔄 ⊨ p [J]
for all S-interpretations I and J of 𝔄.


PROOF
Suppose that I and J are S-interpretations of 𝔄. Let 𝔄 ⊨ p [I]. Since p has no
free variables, 𝔄 ⊨ p [J] by the proof of Lemma 7.1.27. ∎


Lemma 7.1.29 implies that any interpretation will do when modeling sentences. There- 
fore, we make the next definition. 


@ DEFINITION 7.1.30 


For any S-sentence p and S-structure 𝔄 = (A, a), write 𝔄 ⊨ p if 𝔄 ⊨ p [I] for
all S-interpretations I of 𝔄.


Lemma 7.1.29 can be used to quickly prove the next result. 


@ THEOREM 7.1.31 


For any S-structure 𝔄 and S-sentence p,
𝔄 ⊨ p if and only if 𝔄 ⊨ p [I] for some S-interpretation I of 𝔄.

Therefore, letting 𝔅 be the GR-structure of Example 7.1.12, by (7.2), we have that
𝔅 ⊨ ∀x∃y(x ∘ y = e).


The coincidence lemmas (7.1.26 and 7.1.27) also minimize the use of the sets of 
theory symbols. Consider the following. 




M@ THEOREM 7.1.32 


Let S ⊆ T be theory symbol sets. If ℱ is a set of S-formulas, ℱ is S-satisfiable
if and only if ℱ is T-satisfiable.


PROOF
Let ℱ be a set of S-formulas. First, suppose that (A, a) ⊨ ℱ [I], where
dom(a) = S and dom(I) = TERMS(S). Let a′ be an extension of a to T and I′
be an extension of I such that dom(I′) = TERMS(T). Notice that this implies
that a and a′ agree on S = S ∩ T. Therefore, (A, a′) ⊨ ℱ [I′] by Lemma 7.1.27.

Conversely, assume that (A, a) ⊨ ℱ [I] such that both dom(a) = T and
dom(I) = TERMS(T). Let a′ = a ↾ S and I′ = I ↾ TERMS(S). This implies that
(A, a′) ⊨ ℱ [I′] by Lemma 7.1.27. ∎


There is terminology to name the relationship between the structures found in the
proof of Theorem 7.1.32. Let the theory symbols S be a subset of the theory symbols T.
Let 𝔄 = (A, a) be an S-structure and 𝔄′ = (A′, a′) be a T-structure. If A = A′ and
a = a′ ↾ S, we call 𝔄′ an expansion of 𝔄 and 𝔄 a reduct of 𝔄′. Hence, in the first
part of the proof, we started with a structure and then moved to an expansion, and in
the second part, we started with a structure and then moved to a reduct.


Theorems 7.1.31 and 7.1.32 motivate the next two definitions. 


DEFINITION 7.1.33 


Let S ⊆ T be sets of theory symbols. An S-formula p is satisfiable (denoted by
Sat p) if there exists a T-structure that is a model for p. The set of S-formulas
ℱ is satisfiable (denoted by Sat ℱ) if there exists a T-structure that is a model
for ℱ.

DEFINITION 7.1.34


Let S ⊆ T be sets of theory symbols. Assume that ℱ is a set of S-sentences. An
S-sentence p is a consequence of ℱ (denoted by ℱ ⊨ p) when 𝔄 ⊨ ℱ implies
𝔄 ⊨ p for every T-structure 𝔄.


EXAMPLE 7.1.35 


The group axioms state that in a group there is an identity and there are inverses. 
Based on what we know about the integers, we should be able to prove more 
about these elements. For example, we expect that in a group, both 


there is exactly one identity 
and 
every element has a unique inverse 


are true. The uniqueness of the identity is left to Exercise 20. To show the uniqueness
of inverses, let 𝔊 = (G, e, ∘) be a group, and take a ∈ G. Suppose that
a′, a″ ∈ G are inverses of a. Then,

a′ = a′ ∘ e = a′ ∘ (a ∘ a″) = (a′ ∘ a) ∘ a″ = e ∘ a″ = a″.

Therefore, the uniqueness of inverses is a consequence (Definition 7.1.34) of the
group axioms.


Rings 


Consider the equation 2x + 1 = 0. The exact steps needed to find its solution are

(2x + 1) + −1 = 0 + −1,
2x + (1 + −1) = 0 + −1,
2x + 0 = 0 + −1,
2x = −1,
1/2(2x) = 1/2(−1),
(1/2 · 2)x = −1/2,
1x = −1/2,
x = −1/2.

Now examine the steps. There are two operations, addition and multiplication. We
used inverses and identities. The associative law was also used. When studying these
steps, we realize that they cannot be performed within (Z, 0, +) even though the
initial equation had only integer coefficients. This means that the group idea needs to be
expanded. This is done by including two symbols, ⊕ and ⊗, to represent addition and
multiplication. Since these two operations can have their own identities, replace e with ○ to
represent the additive identity. The ideas behind the group axioms are then extended
using RI-sentences.


AXIOMS 7.1.36 [Ring]

• R1. ∀x∀y∀z[x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z]
      ∀x∀y∀z[x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z]
• R2. ∀x∀y(x ⊕ y = y ⊕ x)
• R3. ∀x(○ ⊕ x = x)
• R4. ∀x∃y(x ⊕ y = ○)
• R5. ∀x∀y∀z[x ⊗ (y ⊕ z) = x ⊗ y ⊕ x ⊗ z]
      ∀x∀y∀z[(x ⊕ y) ⊗ z = x ⊗ z ⊕ y ⊗ z]




HM DEFINITION 7.1.37 


An RI-structure ℜ = (R, 0, +, ·) that models the ring axioms is called a ring. If
there exists a multiplicative identity in R, then ℜ is a ring with unity.


The additive inverse of a is −a, and the multiplicative inverse of a, when it exists, is a⁻¹,
assuming that a ≠ 0. We usually write a − b instead of a + (−b). Notice that if
ℜ = (R, 0, +, ·) is a ring, its reduct (R, 0, +) is a group, and by R2 it is abelian.
Also, letting + and · denote addition and multiplication on Z,

ℨ = (Z, 0, +, ·)

is a ring with unity. Also, Q, R, and C are the domains of rings with unity using the
typical operations of addition and multiplication.


M@ EXAMPLE 7.1.38 


Axioms 7.1.36 do not require that the ring multiplication be commutative. Let
ℜ = (R, 0, +, ·). Then, ℜ is a commutative ring if

ℜ ⊨ ∀x∀y(x ⊗ y = y ⊗ x).

• Let + and · denote standard addition and multiplication on Z. Let n ∈ Z.
We conclude that (nZ, 0, +, ·) is a commutative ring. It is without
unity if n ≠ ±1.

• Take [a]_n, [b]_n ∈ Z_n and define + as in Example 7.1.17 and multiplication by
[a]_n · [b]_n = [ab]_n.
Then, (Z_n, [0]_n, +, ·) is a commutative ring (Exercise 29).

Axioms 7.1.36 also do not state that when the additive identity is multiplied by
any element of the ring, the result is the additive identity. It is not among the
axioms because it can be proved. Take a ∈ R. By R3, 0 + 0 = 0, so by R5,

0 · a = (0 + 0) · a = 0 · a + 0 · a.

By R4 and since + is a binary operation,

0 · a + −(0 · a) = (0 · a + 0 · a) + −(0 · a).

Because of R1,

0 · a + −(0 · a) = 0 · a + [0 · a + −(0 · a)].

Hence, 0 = 0 · a + 0, which implies that 0 = 0 · a. Therefore,

ℜ ⊨ ∀x(○ = ○ ⊗ x),

and ∀x(○ = ○ ⊗ x) is a consequence (Definition 7.1.34) of the ring axioms.
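
On a finite domain the ring axioms R1 through R5 can be verified exhaustively, and so can the derived sentence about multiplying by the additive identity. The Python sketch below does this for the integers modulo 6. The modulus and the function name `is_ring` are illustrative choices, not notation from the text.

```python
from itertools import product

def is_ring(domain, zero, add, mul):
    """Brute-force check of R1-R5 on a finite structure (domain, zero, add, mul)."""
    R = list(domain)
    pairs = list(product(R, repeat=2))
    triples = list(product(R, repeat=3))
    closed = all(add(x, y) in R and mul(x, y) in R for x, y in pairs)
    r1 = all(add(x, add(y, z)) == add(add(x, y), z) and
             mul(x, mul(y, z)) == mul(mul(x, y), z) for x, y, z in triples)
    r2 = all(add(x, y) == add(y, x) for x, y in pairs)
    r3 = all(add(zero, x) == x for x in R)
    r4 = all(any(add(x, y) == zero for y in R) for x in R)
    r5 = all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z)) and
             mul(add(x, y), z) == add(mul(x, z), mul(y, z)) for x, y, z in triples)
    return closed and r1 and r2 and r3 and r4 and r5

n = 6
Zn = range(n)

def add_mod(a, b):
    return (a + b) % n

def mul_mod(a, b):
    return (a * b) % n

print(is_ring(Zn, 0, add_mod, mul_mod))
# The derived sentence: the additive identity times anything is the additive identity.
print(all(mul_mod(0, a) == 0 for a in Zn))
```

The second check is redundant once the first succeeds, since the sentence follows from the axioms; running both simply illustrates what "consequence" means on a concrete model.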




M@ EXAMPLE 7.1.39 


For n ∈ Z⁺, define matrix addition on M_n(R) entrywise. For instance,


i Oe a 1 0 8] [2 2 9 
8 Ae Al eNO sh Olas a1 <4), 
5 6 O| Jo -2 3] [5 4 3 


Let 0_n be the zero matrix. It is the n × n matrix with all of its entries equal to
0. As in Example 7.1.19, let · represent matrix multiplication and I_n the identity
matrix. We prove that

𝔐_n(R) = (M_n(R), 0_n, +, ·)

is a ring.


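• To see that matrix addition is associative, we rely on the fact that standard
addition of real numbers is associative. Take three matrices from M_2(R)
and add:

\[
\begin{aligned}
&\begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}
+ \left( \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}
+ \begin{bmatrix} c_{1,1} & c_{1,2} \\ c_{2,1} & c_{2,2} \end{bmatrix} \right) \\
&\quad= \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}
+ \begin{bmatrix} b_{1,1} + c_{1,1} & b_{1,2} + c_{1,2} \\ b_{2,1} + c_{2,1} & b_{2,2} + c_{2,2} \end{bmatrix} \\
&\quad= \begin{bmatrix} a_{1,1} + (b_{1,1} + c_{1,1}) & a_{1,2} + (b_{1,2} + c_{1,2}) \\ a_{2,1} + (b_{2,1} + c_{2,1}) & a_{2,2} + (b_{2,2} + c_{2,2}) \end{bmatrix} \\
&\quad= \begin{bmatrix} (a_{1,1} + b_{1,1}) + c_{1,1} & (a_{1,2} + b_{1,2}) + c_{1,2} \\ (a_{2,1} + b_{2,1}) + c_{2,1} & (a_{2,2} + b_{2,2}) + c_{2,2} \end{bmatrix} \\
&\quad= \begin{bmatrix} a_{1,1} + b_{1,1} & a_{1,2} + b_{1,2} \\ a_{2,1} + b_{2,1} & a_{2,2} + b_{2,2} \end{bmatrix}
+ \begin{bmatrix} c_{1,1} & c_{1,2} \\ c_{2,1} & c_{2,2} \end{bmatrix} \\
&\quad= \left( \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}
+ \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \right)
+ \begin{bmatrix} c_{1,1} & c_{1,2} \\ c_{2,1} & c_{2,2} \end{bmatrix}.
\end{aligned}
\]

• Since

\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}
= \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}
= \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},
\]

the zero matrix is the additive identity.

• To prove that every element of M_2(R) has an additive inverse, take

\[
A = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \in M_2(\mathbb{R}).
\]

Then,

\[
-A = \begin{bmatrix} -a_{1,1} & -a_{1,2} \\ -a_{2,1} & -a_{2,2} \end{bmatrix}
\]

because A + (−A) = 0_2. Generalizing, conclude that
(M_n(R), 0_n, +) ⊨ {G1, G2, G3},
making the GR-structure (M_n(R), 0_n, +) a group.

• Matrix addition is commutative because addition on R is commutative.
Therefore, (M_n(R), 0_n, +) is an abelian group.

• Matrix multiplication is associative.

• Lastly, to show that the operations are distributive, we must show for all
A, B, C ∈ M_n(R),
A(B + C) = AB + AC
and
(A + B)C = AC + BC.

Therefore, 𝔐_n(R) ⊨ {R1, R2, R3, R4, R5}, so 𝔐_n(R) is a ring.

Since I_n is the multiplicative identity, 𝔐_n(R) is a ring with unity, and since
matrix multiplication is not commutative when n ≥ 2, 𝔐_n(R) is then a noncommutative ring.
This proves that ∀x∀y(x ⊗ y = y ⊗ x) is not a consequence of the ring
axioms.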


M@ EXAMPLE 7.1.40 


If R is the domain of a ring, a, b ∈ R \ {0} are zero divisors of the ring means
that a · b = 0. Defining addition and multiplication coordinatewise (Exercise 18),
the ring (Z × Z, (0, 0), +, ·) has zero divisors such as

(1, 0) · (0, 1) = (0, 0).

Other examples can be found in M_2(R) where

\[
\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
\]

However,

\[
\begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix},
\]

showing that a product of two elements can be zero in one order but nonzero in the other.
This situation is common for rings where multiplication is not commutative. We
do, however, have many rings that do not have zero divisors. An integral domain
is a commutative ring with unity that does not have zero divisors. The rings
(Z, 0, +, ·), (Q, 0, +, ·), (R, 0, +, ·), and (C, 0, +, ·) are integral domains.




The equation 2x + 1 = 0 is written with elements of Z and the operations of regular
addition and multiplication. Although Z has no zero divisors, there is no integer that is a
solution to this equation. To solve the equation, we need the existence of multiplicative
inverses. Let (R, 0, +, ·) be a ring with unity. If u ∈ R has the property that there
exists v ∈ R such that u · v = v · u = 1, then u is called a unit. Notice that u and v
are then multiplicative inverses of each other. With this terminology, we make the next
definition.


Hi DEFINITION 7.1.41 
Let (R, 0, +, -) be a ring with unity. 


¢ If all nonzero elements of R are units, R is called a division ring or some- 
times a skew field. 


¢ A commutative division ring is called a field. 


The reason that the equation 2x + 1 = 0 considered at the start of this discussion of rings
can be solved the way it was is that R with addition and multiplication forms a field.


M@ EXAMPLE 7.1.42 


While Z is not a field with standard addition and multiplication, Q, R, and C are.
A more interesting structure is (Z_p, [0]_p, +, ·) when p is a prime. To prove that it
is a field, let [a]_p ∈ Z_p so that [a]_p ≠ [0]_p. We must find an element of Z_p so
that when it is multiplied with [a]_p, the result is [1]_p. Since [a]_p ≠ [0]_p, p does
not divide a. Hence, p and a are relatively prime, so there are integers u and v
such that ua + vp = 1. We are then able to calculate:

[u]_p · [a]_p = [ua]_p = [1 − vp]_p = [1]_p.
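
The integers u and v with ua + vp = 1 can be computed with the extended Euclidean algorithm, which is not described in this section and is used here as an assumption. The Python sketch below finds the multiplicative inverse of each nonzero class modulo a prime; the function names and the choice p = 7 are illustrative.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and u*a + v*b == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def inverse_mod(a, p):
    """Representative of the class [u]_p with [u]_p * [a]_p = [1]_p."""
    g, u, _ = extended_gcd(a % p, p)
    if g != 1:
        raise ValueError("a and p are not relatively prime, so no inverse exists")
    return u % p

p = 7
inverses = {a: inverse_mod(a, p) for a in range(1, p)}
print(inverses)
```

For a composite modulus such as 6, `inverse_mod(2, 6)` raises an error, matching the fact that [2]_6 is not a unit.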


@ EXAMPLE 7.1.43 


Let ℜ be a division ring and take u and v to be elements of the domain of ℜ. Let
1 be unity. Assume uv = 0 and u ≠ 0. Then, u⁻¹ exists, and we can calculate

u⁻¹(uv) = u⁻¹0,
(u⁻¹u)v = 0,
1v = 0,
v = 0.

Therefore, ℜ has no zero divisors.
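
The contrast between fields and rings with zero divisors is easy to observe in the rings of integers modulo n. The Python sketch below lists the zero divisors for a composite and a prime modulus; the function name `zero_divisors` and the moduli 6 and 7 are illustrative choices.

```python
def zero_divisors(n):
    """Nonzero classes a modulo n with a * b = 0 (mod n) for some nonzero class b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(zero_divisors(6))  # [2, 3, 4]
print(zero_divisors(7))  # []
```

The empty list for the prime modulus agrees with the conclusion of this example, since the integers modulo a prime form a field.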




Exercises 
1. Prove the remaining parts of Theorem 7.1.7. 


2. Let ≼ be a linear order on a nonempty set A. Let R be a binary relation symbol.
Define the {R}-structure 𝔄 = (A, ≼). Let I be an {R}-interpretation of 𝔄 such that
I(R) = ≼. Prove the following.

(a) 𝔄 ⊨ ∃x(xRx) [I].

(b) 𝔄 ⊨ ∀x∀y(xRy ∨ yRx) [I].

(c) 𝔄 ⊨ ∀x∀y(xRy ∧ yRx → x = y) [I].


3. Let 𝔅 be the NT-structure (Z, 0^𝔅, 1^𝔅, +^𝔅, ·^𝔅), where 0^𝔅 and 1^𝔅 are the numbers 0
and 1 in Z while +^𝔅 and ·^𝔅 are the standard operations of addition and multiplication
of integers. Let I be an NT-interpretation such that I(x) = 2 and I(y) = −2. Prove the
following.

(a) 𝔅 ⊨ x + y = 0 [I].

(b) 𝔅 ⊨ ∃x([x + 1] + 1 = 0) [I].

(c) 𝔅 ⊨ ∃x∀y(x · y = y) [I].

(d) 𝔅 ⊨ ∀x∀y∀z(¬z = 0 ∧ x · z = y · z → x = y) [I].


4. Show that 20 Vx[(x + x5) +xg =x + (x5 + xg)], where 2 is the NT-structure of 
Example 7.1.10. 


5. Find a set of theory symbols S, an S-structure 2f, and an S-interpretation J such that 
2 FE p [I] for each given formula p. 

(a) x+y=({(14+1])4+)D+4+1 

(b) x/y+z=10 

(c) ∃x∃y(x < y ∧ x + 1 = y)

(d) ∀x∀y∀z(xRy ∧ yRz → zRx)

(e) ∀x∀y(x · y = 0 → x = 0 ∨ y = 0)

6. For each formula in Exercise 5, find a model (2, J) such that 2f F p [I] for each 
given formula p. 


7. Prove that every S-structure is a model of the empty set. 
8. Let A be a set. Is (P(A), ∅, ∩) a group? Explain.


9. Explain why (Z⁺, 0, +), (Z, 1, ·), and (R, 1, ·) are not groups, where the operations
are the standard ones.


10. Suppose that ∗ is an operation on Z defined by x ∗ y = x + y + 2.
(a) Identify the identity ε and the inverses with respect to ∗.
(b) Prove that (Z, ε, ∗) is a group, where ε is the identity found in Exercise 10(a).
(c) Solve 8 ∗ x = 10.


11. Let 0 represent the zero function R → R and + be function addition. That is, for
all x ∈ R,

(f + g)(x) = f(x) + g(x).

(a) Prove that (ᴿR, 0, +) is a group.

(b) Is ᴿR the domain of a group where the binary operation is function division?
If so, what is its identity?

(c) Is ᴿR the domain of a group where the binary operation is composition? If
so, what is its identity?

12. Let n be a positive integer.
(a) Prove that matrix multiplication is associative.
(b) Solve the equation in M_2(R):

\[
\begin{bmatrix} 1 & 4 \\ -3 & 0 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} -3 & 8 \\ 0 & -6 \end{bmatrix}.
\]

(c) Show that M_n(R) is not the domain of a group under matrix multiplication.


13. Let (G, e, ∗) and (G′, e′, ∗′) be two groups. For all a, b ∈ G and a′, b′ ∈ G′ define
(a, a′) · (b, b′) = (a ∗ b, a′ ∗′ b′).
(a) Confirm that · is a binary operation on G × G′.
(b) Show that (G × G′, (e, e′), ·) is a group. Prove that it is abelian if and only if
both of the given groups are abelian.


14. Let n be an integer. Prove that (nZ, 0, +, -) is a commutative ring. 


15. Why is

\[
\left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} : a, b, c, d \in \mathbb{Z}^+ \right\}
\]

not the domain of a ring under the standard matrix operations?


16. Prove that the set [set lost in extraction]
is the domain of a ring with the standard matrix operations.


17. Both + (function addition) and ∘ (composition) are binary operations on ᴿR, but
(ᴿR, 0, +, ∘) is not a ring. Identify which ring axioms fail.


18. Let (R, 0, +, ·) and (R′, 0′, +′, ·′) be rings. Define addition and multiplication on
R × R′ so that for all (a, b), (c, d) ∈ R × R′,

(a, b) + (c, d) = (a + c, b +′ d),
and
(a, b) · (c, d) = (a · c, b ·′ d).
Prove that (R × R′, (0, 0′), +, ·) is a ring.
19. Prove the converse of Example 7.1.24.
20. Prove that
{G1, G2, G3} ⊨ ∀x∀y[∀z(x ∘ z = z ∧ z ∘ x = z)
∧ ∀z(y ∘ z = z ∧ z ∘ y = z) → x = y].




21. Let 𝔊 = (G, e, ∗) be a group so that
𝔊 ⊨ ∀x∀y[(x ∘ y) ∘ (x ∘ y) = x ∘ x ∘ y ∘ y].
Show that 𝔊 is abelian.
22. Prove that ∀x(○ ⊗ x = ○) is a consequence of the ring axioms.


23. Let − be a unary function symbol. Define RI′ = RI ∪ {−}. Show that the given
sentences are consequences of the ring axioms and ∀x(−x ⊕ x = ○).

(a) ∀x∀y[−(x ⊗ y) = −x ⊗ y ∧ −x ⊗ y = x ⊗ −y]

(b) ∀x∀y(−x ⊗ −y = x ⊗ y)

(c) ∀x∀y[−(x ⊕ y) = −x ⊕ −y]

(d) −○ = ○
24. This exercise uses the notation of Exercise 23. Let ℜ be a ring with unity. Let ℜ′
be the expansion of ℜ to RI′ = RI ∪ {−}. Assume that for all r ∈ dom(ℜ′),

−^{ℜ′}(r) ⊕^{ℜ′} r = ○^{ℜ′}.
Prove that ℜ′ ⊨ ∀x[∀y(x ⊗ y = y ∧ y ⊗ x = y) → ∀z(−z = −x ⊗ z)].
25. Let p and q be S-formulas. Prove that the given S-sentences are valid.
(a) p ∨ ¬p
(b) p → q ↔ ¬p ∨ q
(c) ∃x(p ∨ q) ↔ ∃xp ∨ ∃xq
(d) ∀x(p ∧ q) ↔ ∀xp ∧ ∀xq


26. Let R be a binary relation symbol and f be a binary function symbol. Show that
the given sentences are satisfiable.

(a) ∃x(x = x)

(b) ∃x∃y∃z(¬x = y ∧ ¬x = z ∧ ¬y = z)

(c) ∃x∀y(Rxy ∨ x = y)

(d) ∀x∀y(fxy = 1)

(e) ∀x∀y[Rxy → ∃z(Rxz ∧ Rzy)]
27. Let S and T be sets of theory symbols such that S ⊆ T. Let 𝔄 be a reduct to S of
the T-structure 𝔅. Prove that 𝔄 ⊨ p if and only if 𝔅 ⊨ p for all S-sentences p.


28. Suppose that p_0, p_1, ..., p_{n−1} are S-sentences. For every S-structure 𝔄, prove that
𝔄 ⊨ p_0 ∧ p_1 ∧ ··· ∧ p_{n−1} if and only if 𝔄 ⊨ p_i for all i = 0, 1, ..., n − 1.


29. Answer the following about (Z_n, [0]_n, +, ·):
(a) Prove that addition and multiplication of congruence classes are well-defined.
(b) Show that the additive identity is [0]_n.
(c) For all a ∈ Z, show that −[a]_n = [n − a]_n.
(d) Show that [1]_n is the multiplicative identity.
(e) Prove that (Z_n, [0]_n, +, ·) is a commutative ring.
(f) Prove that the ring contains zero divisors when n is not prime.




30. Prove that ∀x∀y∀z(x ⊕ y = x ⊕ z → y = z) is a consequence of the ring axioms.


31. Let ℜ be an integral domain. Prove the following.
(a) ℜ ⊨ ∀x∀y(x ⊗ y = ○ → x = ○ ∨ y = ○).
(b) ℜ ⊨ ∀x∀y∀z(x ⊗ y = x ⊗ z ∧ x ≠ ○ → y = z).


32. Suppose that ℜ = (R, 0, +, ·) is a commutative ring with unity. Show that if
ℜ ⊨ ∀x∀y∃z(x ⊗ z ⊕ y = ○), then ℜ is a field.


33. Is ({0},0,+, -) a field? Explain. 


7.2. SUBSTRUCTURES 


When looking for examples of groups, the GR-structure (Z, 0, +) is often the first to
come to mind. The benefit of this example is that not only are we familiar with the
integers but it has the property that many of its subsets also form groups. Let n ∈ Z.
Addition on Z restricted to nZ × nZ is an associative binary operation on nZ, every
element of nZ has an additive inverse in nZ, and 0 ∈ nZ, so the GR-structure (nZ, 0, +)
is a group. Since n ≠ ±1 implies that nZ ⊂ Z, there are infinitely many different
examples of GR-structures, all within (Z, 0, +). We generalize this idea to arbitrary
structures.


@ DEFINITION 7.2.1 


If 𝔄 = (A, a) and 𝔅 = (B, b) are S-structures, 𝔄 is a substructure of 𝔅 (written
as 𝔄 ⊆ 𝔅) means that A ⊆ B and the following properties hold.

• a(c) = b(c) for all constant symbols c.

• a(R) = b(R) ∩ A^n for every n-ary relation symbol R.

• a(f) = b(f) ↾ A^n for every n-ary function symbol f.

If 𝔄 is a substructure of 𝔅, then 𝔅 is an extension of 𝔄.


Note the difference between a substructure and a reduct and between an extension
and an expansion (see the discussion following Theorem 7.1.32). For all n ∈ Z, the group
(nZ, 0, +) is a substructure of (Z, 0, +), and (Z, 0, +) is an extension of (nZ, 0, +). Here
both structures have the same set of theory symbols, and the domain of one is a subset of
the other. However, (Z, 0, +) is a reduct of (Z, 0, 1, +, ·), and (Z, 0, 1, +, ·) is an
expansion of (Z, 0, +). In this case, the domains are the same, but the theory symbol set
of the one is a subset of the theory symbol set of the other.
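
For a subset A of the domain B to carry a substructure, the restricted operations must again be operations on A, so A has to contain the interpretation of every constant symbol and be closed under every function. The Python sketch below tests these two conditions for subsets of the integers modulo 12 with addition. The function name `carries_substructure`, the encoding of operations as (function, arity) pairs, and the modulus 12 are assumptions made for illustration.

```python
from itertools import product

def carries_substructure(A, B, constants, operations):
    """Can A (a subset of B) be the domain of a substructure?

    constants:  interpretations of the constant symbols in the larger structure
    operations: list of (function, arity) pairs interpreting the function symbols
    """
    A, B = set(A), set(B)
    if not A <= B:
        return False
    if any(c not in A for c in constants):
        return False
    # The restriction of each operation to A must take values in A.
    return all(op(*args) in A
               for op, arity in operations
               for args in product(A, repeat=arity))

def add12(a, b):
    return (a + b) % 12

Z12 = range(12)
print(carries_substructure({0, 4, 8}, Z12, [0], [(add12, 2)]))  # True
print(carries_substructure({0, 1, 2}, Z12, [0], [(add12, 2)]))  # False
```

Relation symbols need no such test, because intersecting a relation with A^n always yields a relation on A.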


@ EXAMPLE 7.2.2 


Let R be a binary relation symbol. Let 𝔄 = ([0, 1], a) and 𝔅 = ([0, 2], b) be
{R}-structures such that a(R) and b(R) are both standard less-than. That is,

a(R) = {(x, y) ∈ R × R : 0 ≤ x < y ≤ 1}
and
b(R) = {(x, y) ∈ R × R : 0 ≤ x < y ≤ 2}.
We conclude that 𝔄 ⊆ 𝔅 because of the following:
• [0, 1] ⊆ [0, 2].
• There are no constant symbols.
• a(R) = b(R) ∩ ([0, 1] × [0, 1]).
• There are no function symbols.


@ EXAMPLE 7.2.3 


Let n ∈ Z. Define the NT-structure 𝔅 = (Z, b), where b(0) is the additive
identity of Z, b(1) is the multiplicative identity of Z, b(+) is standard addition
on Z, and b(·) is standard multiplication on Z. Let 𝔄_n = (nZ, a) such that

a(0) = b(0),
a(1) = b(1),
a(+) = b(+) ↾ (nZ × nZ),
a(·) = b(·) ↾ (nZ × nZ).

Then, 𝔄_n is a substructure of 𝔅.
In particular, Example 7.2.3 gives 
Wy C Wy C A, 
which implies that fg € 5. This is a special case of the next theorem. 
@ THEOREM 7.2.4 
Let 2, B, and © be S-structures. 
© WC A. 
e If 2 C Band B CG, then ACG. 


PROOF 
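That $\mathfrak{A}$ is a substructure of itself is clear, so suppose that $\mathfrak{A}$ is a substructure of $\mathfrak{B}$ and $\mathfrak{B}$ is a substructure of $\mathfrak{C}$. Write $\mathfrak{A} = (A, \mathfrak{a})$, $\mathfrak{B} = (B, \mathfrak{b})$, and $\mathfrak{C} = (C, \mathfrak{c})$. Then, for all constant symbols $c$,

$$\mathfrak{a}(c) = \mathfrak{b}(c) = \mathfrak{c}(c).$$

Since $\mathfrak{A} \subseteq \mathfrak{B}$, $\mathfrak{a}(R) = \mathfrak{b}(R) \cap A^n$, and since $\mathfrak{B} \subseteq \mathfrak{C}$, $\mathfrak{b}(R) = \mathfrak{c}(R) \cap B^n$ for every $n$-ary relation symbol $R$, so

$$\mathfrak{a}(R) = \mathfrak{c}(R) \cap B^n \cap A^n = \mathfrak{c}(R) \cap A^n.$$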


Section 7.2 SUBSTRUCTURES 363 


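Also, $\mathfrak{a}(f) = \mathfrak{b}(f) \upharpoonright A^n$ and $\mathfrak{b}(f) = \mathfrak{c}(f) \upharpoonright B^n$ for all $n$-ary function symbols $f$, so

$$\mathfrak{a}(f) = (\mathfrak{c}(f) \upharpoonright B^n) \upharpoonright A^n = \mathfrak{c}(f) \upharpoonright A^n.$$

Therefore, $\mathfrak{A}$ is a substructure of $\mathfrak{C}$. ∎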


Subgroups 


Let $a$ be an element of a group $(G, e, *)$. For all positive integers $n$, define $a^n$ to be the result of operating $a$ with itself $n$ times. That is,

$$a^1 = a, \quad a^2 = a * a, \quad a^3 = a * a * a, \quad \ldots$$

and
$$a^m * a^n = a^{m+n}.$$
Further, define $a^0 = e$ and $a^{-1}$ to be the inverse of $a$. With this notation, we observe that
$$(a * b)^{-1} = b^{-1} * a^{-1}$$
and
$$a^{-n} = (a^n)^{-1} = (a^{-1})^n.$$

We then gather all of these elements into a set,
$$\langle a \rangle = \{a^n : n \in \mathbb{Z}\},$$
and define the following.


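■ DEFINITION 7.2.5

A group $\mathfrak{G}$ is cyclic if there exists $a \in \operatorname{dom}(\mathfrak{G})$ such that $\operatorname{dom}(\mathfrak{G}) = \langle a \rangle$. The element $a$ is called a generator of $\mathfrak{G}$.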


For example, $(\mathbb{Z}, 0, +)$ is a cyclic group. Both $1$ and $-1$ are generators. However, $\mathbb{Q}$ and $\mathbb{R}$ paired with addition do not form cyclic groups. As for finite groups, each $\mathbb{Z}_n$ is cyclic, generated by $[1]_n$, but the Klein-4 group (Example 7.1.18) is not cyclic because $a^2 = e$ for all $a$ in the group.
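The definition of $\langle a \rangle$ can be explored computationally. The following is a minimal sketch, assuming the additive group $\mathbb{Z}_n$ is modeled by the integers `0, ..., n-1` with addition modulo `n` (so $a^k$ in the multiplicative notation above becomes `k*a % n`). The function names `cyclic_subgroup` and `generators` are illustrative and do not come from the text.

```python
def cyclic_subgroup(a, n):
    """Return <a> inside (Z_n, 0, +), with Z_n modeled as range(n)."""
    elems, x = {0}, a % n
    while x not in elems:          # keep adding a until the values repeat
        elems.add(x)
        x = (x + a) % n
    return elems


def generators(n):
    """All a in Z_n whose cyclic subgroup is the whole group."""
    return [a for a in range(n) if len(cyclic_subgroup(a, n)) == n]


print(sorted(cyclic_subgroup(2, 4)))   # the subgroup with domain {[0]_4, [2]_4}
print(generators(6))                   # the generators of Z_6
```

Under these assumptions, `[1]_n` always appears among the generators, in line with the remark that each $\mathbb{Z}_n$ is cyclic.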

An element $a$ of a group might not generate the entire group, but since $e \in \langle a \rangle$ and both $a^n$ and $a^{-n}$ are elements of $\langle a \rangle$, the set generated by $a$ forms a group using the operation from $\mathfrak{G}$.


■ DEFINITION 7.2.6

A substructure $\mathfrak{H}$ of a group $\mathfrak{G}$ that is a group is called a subgroup of $\mathfrak{G}$.


Every group with at least two elements has at least two subgroups, itself (the improper subgroup) and the subgroup with domain $\{e\}$ (the trivial subgroup). A group that has at most these two subgroups is called simple. For example, $(\mathbb{Z}_2, 0, +)$ and $(\mathbb{Z}_3, 0, +)$ form simple groups, but $(\mathbb{Z}_4, 0, +)$ does not because it has a subgroup with domain $\{[0]_4, [2]_4\}$. Other examples of nonsimple groups are $(\mathbb{R}, 0, +)$ because $(\mathbb{Z}, 0, +)$




is one of its subgroups and (2), 0, +) because ((6), 0, +) is one of its subgroups. These 
subgroups that are not improper are called proper. 

It is tempting to define a subgroup simply as a substructure of a group, but this would not work if the subgroup is to be a group. For example, viewing $\omega$ as a subset of $\mathbb{Z}$ via (5.8) allows $(\omega, 0, +)$ to be a GR-substructure of $(\mathbb{Z}, 0, +)$, but

$$(\omega, 0, +) \models \{\text{G1}, \text{G2}\}$$

yet
$$(\omega, 0, +) \not\models \text{G3}.$$


This example suggests the following. 


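■ THEOREM 7.2.7
A substructure $\mathfrak{H}$ of a group $\mathfrak{G}$ is a subgroup of $\mathfrak{G}$ if and only if $\mathfrak{H} \models \text{G3}$.

PROOF
Write $\mathfrak{G} = (G, \mathfrak{g})$ and $\mathfrak{H} = (H, \mathfrak{h})$ and let $\mathfrak{H} \subseteq \mathfrak{G}$. If $\mathfrak{H}$ is a subgroup, then $\mathfrak{H} \models \text{G3}$. To prove the converse, assume $\mathfrak{H} \models \text{G3}$.

• Let $x, y, z \in H$. Since $\mathfrak{h}(\circ) = \mathfrak{g}(\circ) \upharpoonright (H \times H)$,
$$\begin{aligned}
\mathfrak{h}(\circ)(x, \mathfrak{h}(\circ)(y, z)) &= \mathfrak{g}(\circ)(x, \mathfrak{g}(\circ)(y, z)) \\
&= \mathfrak{g}(\circ)(\mathfrak{g}(\circ)(x, y), z) \\
&= \mathfrak{h}(\circ)(\mathfrak{h}(\circ)(x, y), z).
\end{aligned}$$
The second equality holds because the interpretation of $\circ$ in $\mathfrak{G}$ is associative.

• Let $x \in H$. Because $\mathfrak{h}(e) = \mathfrak{g}(e)$,
$$\mathfrak{h}(\circ)(\mathfrak{h}(e), x) = \mathfrak{g}(\circ)(\mathfrak{g}(e), x) = x$$
and
$$\mathfrak{h}(\circ)(x, \mathfrak{h}(e)) = \mathfrak{g}(\circ)(x, \mathfrak{g}(e)) = x.$$

Therefore, $\mathfrak{H}$ is a group and, thus, a subgroup of $\mathfrak{G}$.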


The standard way to show that a subset of a group forms a subgroup is not to show 
directly that the set satisfies the three group axioms or to appeal to Theorem 7.2.7. 
Instead, what is typically done in algebra is to check that the conditions of the next 
theorem are satisfied by the set. 

■ THEOREM 7.2.8
If $\mathfrak{G} = (G, e, *)$ is a group and $H \subseteq G$, there exists a subgroup of $\mathfrak{G}$ with domain $H$ if
• $H$ is closed under $*$,
• $e \in H$,
• $a^{-1} \in H$ for all $a \in H$.




PROOF 
Suppose that the three hypotheses of the theorem hold. 


• Let $a, b \in H$. By the first hypothesis, $a * b \in H$. Therefore, $* \upharpoonright (H \times H)$ is a binary operation on $H$.
• Since $*$ is associative on $G$, the restriction of $*$ to $H$ must be associative.
• The second hypothesis gives $H$ an identity element.
• Every element of $H$ has its inverse in $H$ by the third hypothesis.

Therefore, $(H, e, * \upharpoonright [H \times H])$ is a group. Since $H \subseteq \operatorname{dom}(\mathfrak{G})$, we conclude that $(H, e, * \upharpoonright [H \times H])$ is a subgroup of $\mathfrak{G}$. ∎
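The three hypotheses of Theorem 7.2.8 translate directly into a finite check. The sketch below is only an illustration, assuming the group is $(\mathbb{Z}_n, 0, +)$ modeled by `range(n)` with addition modulo `n` and that `H` is given as a subset of `range(n)`; the name `is_subgroup` is hypothetical.

```python
def is_subgroup(H, n):
    """Check the hypotheses of the subgroup test for H inside (Z_n, 0, +)."""
    H = set(H)
    closed = all((a + b) % n in H for a in H for b in H)   # closure under +
    has_identity = 0 in H                                  # identity element
    has_inverses = all((-a) % n in H for a in H)           # additive inverses
    return closed and has_identity and has_inverses


print(is_subgroup({0, 2}, 4))      # domain of a subgroup of Z_4
print(is_subgroup({0, 1}, 4))      # fails closure: 1 + 1 = 2 is missing
```

The check mirrors the proof: associativity is not tested because it is inherited from the larger group.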


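■ EXAMPLE 7.2.9

To illustrate the theorem, take a group $\mathfrak{G} = (G, e, *)$ and a family of subgroups $(H_i, e, * \upharpoonright [H_i \times H_i])$ for all $i \in I$. Although the union of subgroups might not be a subgroup (Exercise 9), we can show that

$$\left( \bigcap_{i \in I} H_i,\ e,\ * \upharpoonright \left[ \bigcap_{i \in I} H_i \times \bigcap_{i \in I} H_i \right] \right)$$

is a subgroup of $\mathfrak{G}$.

• By Exercise 3.4.22(b), $\bigcap_{i \in I} H_i \subseteq G$.
• Let $a, b \in \bigcap_{i \in I} H_i$. This means that $a, b \in H_i$ for all $i \in I$. Since each $H_i$ is closed under the operation of $\mathfrak{G}$, $a * b \in H_i$ for all $i \in I$. Hence,
$$a * b \in \bigcap_{i \in I} H_i.$$
• Since $e \in H_i$ for every $i \in I$, we must have $e \in \bigcap_{i \in I} H_i$.
• Take $a$ to be an element of $\bigcap_{i \in I} H_i$. Then, $a^{-1} \in H_i$ for all $i \in I$, so $a^{-1} \in \bigcap_{i \in I} H_i$.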
Now we return to cyclic groups. 


■ THEOREM 7.2.10

A subgroup of a cyclic group is cyclic.




PROOF 

Let $\mathfrak{G} = (G, e, *)$ be a cyclic group with generator $a$. Let $\mathfrak{H} = (H, e, *)$ be a subgroup of $\mathfrak{G}$. If $\mathfrak{H}$ is the trivial subgroup, the subgroup is cyclic with generator $e$. So suppose that $\mathfrak{H}$ is not the trivial subgroup. Because $\mathfrak{G}$ is cyclic, there exists a least natural number $n > 0$ such that $a^n \in H$ (Theorem 5.2.13). Suppose that $a^n$ is not a generator of $\mathfrak{H}$. This means that there exists $m \in \omega$ with $m > n$ such that $a^m \in H$ but $a^m \notin \langle a^n \rangle$. This combined with the division algorithm (Theorem 4.3.31) yields unique natural numbers $q$ and $r$ such that $m = nq + r$ and $0 < r < n$. Therefore,

$$a^m = a^{nq + r} = a^{nq} * a^r,$$

and from this, we conclude that
$$a^r = a^{-nq} * a^m.$$

Since $a^{-nq}, a^m \in H$, $a^r$ is an element of $H$. This contradicts the minimality of $n$ because $r < n$. Thus, $a^n$ is a generator of $\mathfrak{H}$. ∎
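The argument of Theorem 7.2.10 can be checked by brute force on a small cyclic group. The sketch below assumes the group is $(\mathbb{Z}_{12}, 0, +)$ modeled by `range(12)`, enumerates every subset containing `0`, and confirms that the least positive element of each subgroup generates it. All function names are illustrative, and the exhaustive search is only sensible for small `n`.

```python
from itertools import combinations


def is_subgroup(H, n):
    """Subgroup test for H inside (Z_n, 0, +)."""
    return (0 in H
            and all((a + b) % n in H for a in H for b in H)
            and all((-a) % n in H for a in H))


def subgroups(n):
    """Brute-force list of the domains of all subgroups of (Z_n, 0, +)."""
    found = []
    for size in range(1, n + 1):
        for rest in combinations(range(1, n), size - 1):
            H = {0, *rest}
            if is_subgroup(H, n):
                found.append(H)
    return found


def least_positive_generates(H, n):
    """Mirror the proof: the least positive element d of H generates H."""
    if H == {0}:
        return True                     # trivial subgroup, generated by 0
    d = min(H - {0})
    return H == {(k * d) % n for k in range(n)}


print(all(least_positive_generates(H, 12) for H in subgroups(12)))
```

In additive notation the element $a^n$ of the proof corresponds to the least positive representative `d`.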


Subrings 


Some of the examples of rings had domains that were subsets of other rings. For ex- 
ample, nZ is a subset of Z, and Q is a subset of R. Generalizing leads to the next 
definition. 


■ DEFINITION 7.2.11

A substructure $\mathfrak{S}$ of a ring $\mathfrak{R}$ that is a ring is called a subring of $\mathfrak{R}$.

A subring of $\mathfrak{R}$ such that its domain is a proper subset of the domain of $\mathfrak{R}$ is called a proper subring. The ring itself is called the improper subring. The subring with domain $\{0\}$ is the trivial subring.


■ EXAMPLE 7.2.12
• $(\{[0]_9, [3]_9, [6]_9\}, [0]_9, +, \cdot)$ is a subring of $(\mathbb{Z}_9, [0]_9, +, \cdot)$.
• $(\mathbb{Z}, 0, +, \cdot)$ is a subring of $(\mathbb{R}, 0, +, \cdot)$.
• $(M_n(\mathbb{R}), 0_n, +, \cdot)$ is a subring of $(M_n(\mathbb{C}), 0_n, +, \cdot)$.


As with subgroups, a substructure of a ring is not necessarily a subring, but we do 
have results similar to those for groups found in Theorems 7.2.8 and 7.2.7. They are 
stated without proof since they follow quickly from Definition 7.2.11. 


■ THEOREM 7.2.13
A substructure $\mathfrak{S}$ of a ring $\mathfrak{R}$ is a subring of $\mathfrak{R}$ if and only if $\mathfrak{S} \models \text{R4}$.

We follow the convention that if $\mathfrak{R}$ represents an arbitrary ring, $\mathfrak{R} = (R, 0, +, \cdot)$ and if $\mathfrak{R}'$ also represents an arbitrary ring, $\mathfrak{R}' = (R', 0', +', \cdot')$. This will help us with our notation.


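■ THEOREM 7.2.14
If $\mathfrak{R}$ is a ring and $S \subseteq R$, there exists a subring of $\mathfrak{R}$ with domain $S$ if
• $S$ is closed under $+$ and $\cdot$,
• $0 \in S$,
• $-a \in S$ for all $a \in S$.

The subring found while proving Theorem 7.2.14 is $(S, 0, + \upharpoonright [S \times S], \cdot \upharpoonright [S \times S])$.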


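■ EXAMPLE 7.2.15

We use Theorem 7.2.14 to show that $\mathfrak{S} = (S, 0_2, + \upharpoonright [S \times S], \cdot \upharpoonright [S \times S])$ is a subring of $\mathfrak{M}_2(\mathbb{R})$ (Example 7.1.39), where

$$S = \left\{ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} : a, b \in \mathbb{R} \right\}.$$

• Let $a, b, a', b' \in \mathbb{R}$, and assume that
$$A = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} a' & 0 \\ 0 & b' \end{bmatrix}.$$
Then,
$$A + B = \begin{bmatrix} a + a' & 0 \\ 0 & b + b' \end{bmatrix}$$
and
$$AB = \begin{bmatrix} aa' & 0 \\ 0 & bb' \end{bmatrix}.$$
These are elements of $S$.
• Clearly, the zero matrix is in $S$. (Let $a = b = 0$.)
• Take $a, b \in \mathbb{R}$ and write
$$A = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}.$$
Hence,
$$-A = \begin{bmatrix} -a & 0 \\ 0 & -b \end{bmatrix}$$
is an element of $S$.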




■ EXAMPLE 7.2.16

Let $\mathfrak{S}$ and $\mathfrak{T}$ be subrings of a ring $\mathfrak{R}$. Let $S$ be the domain of $\mathfrak{S}$ and $T$ be the domain of $\mathfrak{T}$. Check the conditions of Theorem 7.2.14 to show that there exists a subring of $\mathfrak{R}$ with domain $S \cap T$.

• To prove closure, let $x, y \in S \cap T$. This means that $x + y \in S$ and $x + y \in T$. Hence, $x + y \in S \cap T$. Similarly, $xy \in S \cap T$.
• Since $0 \in S$ and $0 \in T$, $0 \in S \cap T$.
• Suppose $x \in S \cap T$. Then $x \in S$ and $x \in T$. Since these are subrings, $-x \in S$ and $-x \in T$. Thus, $-x \in S \cap T$.


Ideals 


The subring $\mathfrak{S}$ of the ring $\mathfrak{M}_2(\mathbb{R})$ in Example 7.2.15 lacks a property that is often desirable to have in a subring. Observe that
1 0 1 2 1 2 
lo 1 1. l-|5 ‘jes. 


so in general it is false that $AB \in S$ and $BA \in S$ for all $A \in S$ and $B \in M_2(\mathbb{R})$.


■ DEFINITION 7.2.17

Let $\mathfrak{R}$ be a ring with domain $R$ and $\mathfrak{I}$ be a subring of $\mathfrak{R}$ with domain $I$.
• If $ra \in I$ and $ar \in I$ for all $r \in R$ and $a \in I$, then $\mathfrak{I}$ is an ideal of $\mathfrak{R}$.
• If $ra \in I$ for all $r \in R$ and $a \in I$, then $\mathfrak{I}$ is a left ideal.
• If $ar \in I$ for all $r \in R$ and $a \in I$, then $\mathfrak{I}$ is a right ideal.

A ring $\mathfrak{R}$ is an ideal of itself, called the improper ideal of $\mathfrak{R}$. All other ideals of $\mathfrak{R}$ are proper, including the ideal formed by $\{0\}$. Furthermore, in a commutative ring, there is no difference between a left and right ideal. However, if the ring is not commutative, a left ideal might not be a right ideal.

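■ EXAMPLE 7.2.18

Define
$$I = \left\{ \begin{bmatrix} a & 0 \\ b & 0 \end{bmatrix} : a, b \in \mathbb{R} \right\}.$$

Using matrix multiplication,

$$\begin{bmatrix} x & y \\ z & w \end{bmatrix} \begin{bmatrix} a & 0 \\ b & 0 \end{bmatrix} = \begin{bmatrix} xa + yb & 0 \\ za + wb & 0 \end{bmatrix} \in I,$$

but

$$\begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \notin I.$$

Therefore, the subring $(I, 0_2, + \upharpoonright [I \times I], \cdot \upharpoonright [I \times I])$ of $\mathfrak{M}_2(\mathbb{R})$ is a left ideal but not a right ideal.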


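■ EXAMPLE 7.2.19

Let $I = \{[0]_4, [2]_4\}$. Then, $\mathfrak{I} = (I, [0]_4, + \upharpoonright [I \times I], \cdot \upharpoonright [I \times I])$ is a subring of the ring $\mathfrak{R} = (\mathbb{Z}_4, [0]_4, +, \cdot)$. It also forms an ideal. To see this, check the calculations:

$$\begin{array}{ll}
[0]_4 \cdot [0]_4 = [0]_4 & [0]_4 \cdot [0]_4 = [0]_4, \\
{[1]_4} \cdot [0]_4 = [0]_4 & [0]_4 \cdot [1]_4 = [0]_4, \\
{[2]_4} \cdot [0]_4 = [0]_4 & [0]_4 \cdot [2]_4 = [0]_4, \\
{[3]_4} \cdot [0]_4 = [0]_4 & [0]_4 \cdot [3]_4 = [0]_4, \\
{[0]_4} \cdot [2]_4 = [0]_4 & [2]_4 \cdot [0]_4 = [0]_4, \\
{[1]_4} \cdot [2]_4 = [2]_4 & [2]_4 \cdot [1]_4 = [2]_4, \\
{[2]_4} \cdot [2]_4 = [0]_4 & [2]_4 \cdot [2]_4 = [0]_4, \\
{[3]_4} \cdot [2]_4 = [2]_4 & [2]_4 \cdot [3]_4 = [2]_4.
\end{array}$$

When we multiply any element of $\mathbb{Z}_4$ by an element of $I$ on either side, the result is an element of $I$.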
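The table of products above can be generated mechanically. The following sketch assumes $\mathbb{Z}_n$ is modeled by `range(n)` with arithmetic modulo `n`; `is_ideal` is a hypothetical helper that checks the subring conditions of Theorem 7.2.14 together with the absorption condition of Definition 7.2.17.

```python
def is_ideal(I, n):
    """Check that I is the domain of a two-sided ideal of (Z_n, 0, +, *)."""
    I = set(I)
    subring = (0 in I
               and all((a + b) % n in I and (a * b) % n in I
                       for a in I for b in I)
               and all((-a) % n in I for a in I))
    absorbs = all((r * a) % n in I and (a * r) % n in I
                  for r in range(n) for a in I)
    return subring and absorbs


# Reproduce the products r*a for r in Z_4 and a in I = {0, 2}.
for a in (0, 2):
    for r in range(4):
        print(f"[{r}]_4 * [{a}]_4 = [{(r * a) % 4}]_4")

print(is_ideal({0, 2}, 4))
```

Because $\mathbb{Z}_4$ is commutative, the left and right products coincide, which is why one column of the table determines the other.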
■ EXAMPLE 7.2.20
Let $\mathfrak{R}$ be a ring. It is left to Exercise 18 to show that the ring
$$\mathfrak{S} = (R \times \{0\}, (0, 0), +', \cdot')$$
is a subring of
$$\mathfrak{T} = (R \times R, (0, 0), +, \cdot)$$

with $+$ and $\cdot$ defined coordinatewise and $+'$ and $\cdot'$ being the restrictions of $+$ and $\cdot$ to $R \times \{0\}$. To show that $\mathfrak{S}$ is an ideal of $\mathfrak{T}$, let $(r, s) \in R \times R$ and $(a, 0) \in R \times \{0\}$. We then calculate

$$(r, s) \cdot (a, 0) = (ra, 0) \in R \times \{0\}$$

and
$$(a, 0) \cdot (r, s) = (ar, 0) \in R \times \{0\}.$$


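■ EXAMPLE 7.2.21
Let $\mathfrak{R}$ be a ring. Let $S$ and $T$ be subsets of $R$ such that

$$\mathfrak{S} = (S, 0, + \upharpoonright [S \times S], \cdot \upharpoonright [S \times S])$$

and
$$\mathfrak{T} = (T, 0, + \upharpoonright [T \times T], \cdot \upharpoonright [T \times T])$$

are ideals of $\mathfrak{R}$. Define
$$S + T = \{s + t : s \in S \text{ and } t \in T\}.$$
We prove that $S + T$ is the domain of an ideal of $\mathfrak{R}$.

• Take $x, y \in S + T$. This means that $x = s + t$ and $y = s' + t'$ for some $s, s' \in S$ and $t, t' \in T$. Then,
$$\begin{aligned}
x + y &= (s + t) + (s' + t') \\
&= s + (t + s') + t' \\
&= s + (s' + t) + t' \\
&= (s + s') + (t + t').
\end{aligned}$$
Thus, $x + y \in S + T$ since $s + s' \in S$ and $t + t' \in T$. Also,
$$xy = (s + t)(s' + t') = (s + t)s' + (s + t)t'.$$
Since $s + t \in R$ and $\mathfrak{S}$ is an ideal, $(s + t)s' \in S$. Likewise, $(s + t)t'$ is an element of $T$. Hence, $xy \in S + T$.
• We know that $0 \in S + T$ because $0 = 0 + 0$ and $0 \in S$ and $0 \in T$.
• Let $s \in S$ and $t \in T$. Then, since $-s \in S$ and $-t \in T$,
$$-(s + t) = -s + -t \in S + T.$$
• Let $r \in R$, $s \in S$, and $t \in T$. Since $rs \in S$ and $rt \in T$,
$$r(s + t) = rs + rt \in S + T,$$
and since $sr \in S$ and $tr \in T$,
$$(s + t)r = sr + tr \in S + T.$$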
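A concrete instance of $S + T$ can be computed in a finite ring. The sketch below assumes the ring $(\mathbb{Z}_{12}, 0, +, \cdot)$ modeled by `range(12)`; the helpers `principal_ideal` and `ideal_sum` are illustrative names, and the choice of the ideals generated by 4 and 6 is an example not taken from the text.

```python
def principal_ideal(a, n):
    """The set {r*a : r in Z_n}, with Z_n modeled as range(n)."""
    return {(r * a) % n for r in range(n)}


def ideal_sum(S, T, n):
    """The set S + T = {s + t : s in S and t in T} inside Z_n."""
    return {(s + t) % n for s in S for t in T}


S = principal_ideal(4, 12)          # {0, 4, 8}
T = principal_ideal(6, 12)          # {0, 6}
sum_ST = ideal_sum(S, T, 12)

# Absorption, as in the last bullet of the example: r(s + t) stays in S + T.
absorbs = all((r * x) % 12 in sum_ST for r in range(12) for x in sum_ST)
print(sorted(sum_ST), absorbs)
```

Here the sum turns out to be the set of even residues, which is again the domain of an ideal of $\mathbb{Z}_{12}$.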


Cyclic subgroups (Definition 7.2.5) are generated by a single element. The corre- 
sponding notion in rings is the following definition. 


■ DEFINITION 7.2.22
Let $\mathfrak{R}$ be a ring. For every $a \in R$, define
$$\langle a \rangle = \{ra : r \in R\}.$$

If $\mathfrak{I}$ is a left ideal of $\mathfrak{R}$ such that $\operatorname{dom}(\mathfrak{I}) = \langle a \rangle$ for some $a \in R$, then $\mathfrak{I}$ is called a principal left ideal and $a$ is a generator of $\mathfrak{I}$. If $\mathfrak{R}$ is commutative, $\mathfrak{I}$ is an ideal of $\mathfrak{R}$ called a principal ideal.

The set $n\mathbb{Z}$ is the domain of an ideal of the ring $(\mathbb{Z}, 0, +, \cdot)$. It is a principal ideal because $n\mathbb{Z} = \langle n \rangle$. (See Exercises 16 and 21.) In fact, every element of a ring generates a left ideal of the ring.




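■ THEOREM 7.2.23

For every ring $\mathfrak{R}$ and $a \in R$, the ring $\mathfrak{I} = (\langle a \rangle, 0, + \upharpoonright [\langle a \rangle \times \langle a \rangle], \cdot \upharpoonright [\langle a \rangle \times \langle a \rangle])$ is a left ideal of $\mathfrak{R}$.

PROOF
First, show that $\mathfrak{I}$ is a subring.

• Let $r, s \in R$. Then, because $r + s$ and $(ra)s$ are elements of $\operatorname{dom}(\mathfrak{R})$,
$$ra + sa = (r + s)a \in \langle a \rangle$$
and
$$(ra)(sa) = [(ra)s]a \in \langle a \rangle.$$
• Since $0a = 0$ [Exercise 7.1.22], we have that $0 \in \langle a \rangle$.
• If $r \in R$, then $(-r)a \in \langle a \rangle$ and $(-r)a + ra = (-r + r)a = 0a = 0$.

To prove that $\mathfrak{I}$ is a left ideal of $\mathfrak{R}$, take $r, s \in R$. Then, $r(sa) \in \langle a \rangle$ because $rs \in R$ and $r(sa) = (rs)a$. ∎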


Notice that a principal left ideal might not be a two-sided ideal. Example 7.2.18 com- 
bined with the next example illustrates this fact. 


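■ EXAMPLE 7.2.24
Let $\mathfrak{I} = (I, 0_2, + \upharpoonright [I \times I], \cdot \upharpoonright [I \times I])$ be a left ideal of $\mathfrak{M}_2(\mathbb{R})$, where

$$I = \left\{ \begin{bmatrix} a & 0 \\ b & 0 \end{bmatrix} : a, b \in \mathbb{R} \right\}.$$

By Theorem 7.2.23, it is a principal left ideal of $\mathfrak{M}_2(\mathbb{R})$ because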


=([o ol) 


To prove this, it suffices to take a, b € R and observe that 


allel" los o| 


implies 




■ EXAMPLE 7.2.25

Every ideal in the ring $\mathfrak{Z} = (\mathbb{Z}, 0, +, \cdot)$ is principal. To see this, let $\mathfrak{I}$ be an ideal of $\mathfrak{Z}$. We must find an element of $\mathbb{Z}$ that generates $I = \operatorname{dom}(\mathfrak{I})$. We have two cases to consider:

• If $I = \{0\}$, then $I = \langle 0 \rangle$.
• Suppose $I \ne \{0\}$. This means that $I \cap \mathbb{Z}^+ \ne \varnothing$. By Theorem 5.2.13 and Exercise 5.3.1, $I$ must contain a minimal positive integer. Call it $m$. We claim that $I = \langle m \rangle$. It is clear that $\langle m \rangle \subseteq I$, so take $a \in I$ and divide it by $m$. The division algorithm (Theorem 4.3.31) gives $q, r \in \mathbb{Z}$ so that
$$a = mq + r$$
with $0 \le r < m$. Then, $r \in I$ because $r = a - mq$ and $a, mq \in I$. If $r > 0$, then we have a contradiction of the fact that $m$ is the smallest positive integer in $I$. Hence, $r = 0$ and $a = mq$. This means that $a \in \langle m \rangle$.

$\mathfrak{Z}$ is an example of a principal ideal domain, an integral domain in which every ideal is principal.
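The role of the minimal positive integer can be seen numerically. The sketch below takes the set $4\mathbb{Z} + 6\mathbb{Z}$ (the domain of an ideal of $\mathfrak{Z}$ by Example 7.2.21) and searches a finite window of it for the least positive element. The window size `bound` and the function name are assumptions made only for this illustration, and the comparison with `math.gcd` is an observation rather than a statement from the text.

```python
from math import gcd


def least_positive_combination(a, b, bound=20):
    """Smallest positive a*x + b*y with |x|, |y| <= bound (a, b not both 0)."""
    values = {a * x + b * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)}
    return min(v for v in values if v > 0)


m = least_positive_combination(4, 6)
# Every sampled element of 4Z + 6Z is a multiple of m, as the example predicts.
all_multiples = all((4 * x + 6 * y) % m == 0
                    for x in range(-5, 6) for y in range(-5, 6))
print(m, gcd(4, 6), all_multiples)
```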


Exercises 


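1. Let $R$ be a binary relation symbol. Define the $\{R\}$-structures $\mathfrak{A} = (\mathbb{Q}, <_{\mathbb{Q}})$ and $\mathfrak{B} = (\mathbb{R}, <_{\mathbb{R}})$, where $<_{\mathbb{Q}}$ refers to standard less-than on $\mathbb{Q}$ and $<_{\mathbb{R}}$ refers to standard less-than on $\mathbb{R}$. Prove that $\mathfrak{A}$ is a substructure of $\mathfrak{B}$.

2. Let $\mathfrak{A} = (A, \mathfrak{a})$, $\mathfrak{B} = (B, \mathfrak{b})$, and $\mathfrak{C} = (C, \mathfrak{c})$ be $S$-structures such that $\mathfrak{B} \subseteq \mathfrak{A}$ and $\mathfrak{C} \subseteq \mathfrak{A}$. Define $\mathfrak{D} = (D, \mathfrak{d})$ such that $D = B \cap C$, $\mathfrak{d}(c) = \mathfrak{a}(c)$ for all constant symbols $c \in S$, $\mathfrak{d}(R) = \mathfrak{b}(R) \cap \mathfrak{c}(R)$ for all relation symbols $R \in S$, and $\mathfrak{d}(f) = \mathfrak{b}(f) \cap \mathfrak{c}(f)$ for all function symbols $f \in S$. Prove that $\mathfrak{D} \subseteq \mathfrak{A}$.

3. Let $\kappa$ be a cardinal. Define $\mathfrak{A}_\alpha = (A_\alpha, \mathfrak{a}_\alpha)$ to be an $S$-structure for all $\alpha \in \kappa$. The family $\{\mathfrak{A}_\gamma : \gamma \in \kappa\}$ is called a chain of $S$-structures if for all $\alpha \in \beta \in \kappa$, we have that $\mathfrak{A}_\alpha \subseteq \mathfrak{A}_\beta$. Define the $S$-structure $\bigcup_{\gamma \in \kappa} \mathfrak{A}_\gamma = (\bigcup_{\gamma \in \kappa} A_\gamma, \mathfrak{a})$ so that for every relation symbol $R \in S$,
$$\mathfrak{a}(R) = \bigcup_{\gamma \in \kappa} \mathfrak{a}_\gamma(R),$$
and for every function symbol $f \in S$,
$$\mathfrak{a}(f) = \bigcup_{\gamma \in \kappa} \mathfrak{a}_\gamma(f).$$
Prove the following.
(a) $\{\mathfrak{a}_\gamma(R) : \gamma \in \kappa\}$ is a chain with respect to $\subseteq$ for all relation symbols $R \in S$.
(b) $\bigcup_{\gamma \in \kappa} \mathfrak{a}_\gamma(f)$ is a function.
(c) $\mathfrak{A}_\alpha \subseteq \bigcup_{\gamma \in \kappa} \mathfrak{A}_\gamma$ for all $\alpha \in \kappa$.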




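4. Let $S \subseteq T$ be sets of subject symbols. Let $\mathfrak{A}$ and $\mathfrak{B}$ be $T$-structures. Prove that if $\mathfrak{A} \subseteq \mathfrak{B}$, then the reduct of $\mathfrak{A}$ to $S$ is a substructure of the reduct of $\mathfrak{B}$ to $S$.

5. Find $S$-substructures of the given $S$-structures. If possible, find an $S$-sentence that is true in the given structure but not true in the substructure.
(a) $(\mathbb{Z} \cup \mathrm{P}(\mathbb{Z}), \in)$, $S = \text{ST}$
(b) $(\omega, \varnothing, \{\varnothing\}, +, \cdot)$, $S = \text{NT}$
(c) $(\mathbb{C}, 0, +)$, $S = \text{GR}$
(d) $(\mathbb{R}, 0, +, \cdot, <)$, $S = \text{OF}$

6. Let $\mathfrak{G}$ be a group with domain $G$. Find all subgroups of $\mathfrak{G}$ given the following and assuming the standard operations for each set.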

(a) Gis the Klein-4 group 

(b) G=Zs 

() G=Zy 

(d) G=Z,x 2, 

(ec) G=2Z,xZe 
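7. Show that $\{f \in {}^{\mathbb{R}}\mathbb{R} : f(0) = 0\}$ is the domain of a subgroup of the group $({}^{\mathbb{R}}\mathbb{R}, 0, +)$.

8. Define $H = \{A \in M_n(\mathbb{R}) : a_{11} + a_{22} + \cdots + a_{nn} = 0\}$. Prove that $(H, 0_n, +)$ is a subgroup of the group $(M_n(\mathbb{R}), 0_n, +)$.

9. Let $\{\mathfrak{H}_i : i \in I\}$ be a family of subgroups of the group $\mathfrak{G}$. Give an example to show that $\bigcup_{i \in I} \operatorname{dom}(\mathfrak{H}_i)$ is not necessarily the domain of a subgroup of $\mathfrak{G}$.

10. Let $(H, e, * \upharpoonright [H \times H])$ and $(K, e, * \upharpoonright [K \times K])$ be two subgroups of an abelian group $\mathfrak{G} = (G, e, *)$. Define
$$HK = \{a * b : a \in H \wedge b \in K\}.$$
Prove that $(HK, e, * \upharpoonright [HK \times HK])$ is a subgroup of $\mathfrak{G}$.

11. For any group $\mathfrak{G} = (G, e, *)$, let $S \subseteq G$ and define
$$H = \{a \in G : a * b = b * a \text{ for all } b \in S\}.$$
Prove that $(H, e, * \upharpoonright [H \times H])$ is a subgroup of $\mathfrak{G}$.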

12. Demonstrate that simple groups are cyclic. 

13. Prove Theorem 7.2.14. 

14. Prove that the subrings of a ring with unity are rings with unity. 


15. Let $\mathfrak{R}$ be a ring with domain $R$. Find all ideals of $\mathfrak{R}$ given the following and assuming the standard operations for each set.

(a) R=Z, 

(b) R=Z, 

(c) R=Z, 




(d) R=Z)) 
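16. Let $n \in \mathbb{Z}$. Prove that $n\mathbb{Z}$ is the domain of an ideal of the ring $(\mathbb{Z}, 0, +, \cdot)$.

17. Let $\mathfrak{R} = (\mathbb{Z} \times \mathbb{Z}, (0, 0), +, \cdot)$.
(a) Prove that $\{(2m, 2n) : m, n \in \mathbb{Z}\}$ is the domain of an ideal of $\mathfrak{R}$.
(b) Prove that $\{(2n, 2n) : n \in \mathbb{Z}\}$ is not the domain of an ideal of $\mathfrak{R}$.
(c) Prove that $\mathbb{Z} \times 3\mathbb{Z}$ is the domain of a principal ideal of $\mathfrak{R}$.
(d) Is $\{(2m, 2n) : m, n \in \mathbb{Z}\}$ the domain of a principal ideal of $\mathfrak{R}$?

18. Prove that $\mathfrak{S}$ is a subring of $\mathfrak{T}$ in Example 7.2.20.

19. Let $\mathfrak{R}$ be a ring with unity and $\mathfrak{I}$ an ideal of $\mathfrak{R}$. Let $R$ be the domain of $\mathfrak{R}$ and $I$ be the domain of $\mathfrak{I}$. Prove that if $u$ is a unit and $u \in I$, then $I = R$.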


20. Prove that a field has no proper, nontrivial ideals. 


21. Prove that nZ is the domain of a principal ideal of the ring (Z,0,+,-) for any 
integer n. 


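22. Show that for any ideal $\mathfrak{I}$, if $a \in \operatorname{dom}(\mathfrak{I})$, then $\langle a \rangle \subseteq \operatorname{dom}(\mathfrak{I})$.

23. Take a ring $\mathfrak{R}$ and let $a \in \operatorname{dom}(\mathfrak{R})$. Show $\langle a \rangle$ is the domain of a left ideal of $\mathfrak{R}$ but not necessarily a right ideal if $\mathfrak{R}$ is not commutative.

24. Let $u$ be a unit of a ring $\mathfrak{R}$. Show for all $a \in \operatorname{dom}(\mathfrak{R})$, $\langle a \rangle = \langle ua \rangle$.

25. An ideal $\mathfrak{P} = (P, 0, +, \cdot)$ of a commutative ring $\mathfrak{R} = (R, 0, +, \cdot)$ is prime means for all $a, b \in R$, if $ab \in P$, then $a \in P$ or $b \in P$. Let $p \in \mathbb{Z}^+$ be a prime number (Example 2.4.18). Prove that $(p\mathbb{Z}, 0, + \upharpoonright p\mathbb{Z}, \cdot \upharpoonright p\mathbb{Z})$ is a prime ideal of $(\mathbb{Z}, 0, +, \cdot)$.

26. Prove that the trivial subring of an integral domain is a prime ideal.

27. Let $\mathfrak{R}$ be a commutative ring and $\mathfrak{M}$ a proper ideal of $\mathfrak{R}$. If no proper ideal of $\mathfrak{R}$ has $\mathfrak{M}$ as a proper ideal, $\mathfrak{M}$ is a maximal ideal of $\mathfrak{R}$. Assume that $p$ is a prime number. Prove that $p\mathbb{Z}$ is the domain of a maximal ideal of $(\mathbb{Z}, 0, +, \cdot)$.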


28. Prove that the trivial ideal is the maximal ideal of a field. 


29. Let $p \in \mathbb{Z}$ be prime. Prove that $\{(pa, b) : a, b \in \mathbb{Z}\}$ is the domain of a maximal ideal of the ring with domain $\mathbb{Z} \times \mathbb{Z}$ (Example 7.2.20).


30. Use Zorn’s lemma (Theorem 5.1.13) to prove that every commutative ring with 
unity has a maximal ideal. 


7.3. HOMOMORPHISMS 


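The function $f : [0, \infty) \to (-\infty, 0]$ defined by $f(x) = -x$ preserves $\le$ with $\ge$ (Example 4.5.25). Define the $\{\preccurlyeq\}$-structures $\mathfrak{A} = ([0, \infty), \mathfrak{a})$ and $\mathfrak{B} = ((-\infty, 0], \mathfrak{b})$, where $\mathfrak{a}(\preccurlyeq) = {\le}$ and $\mathfrak{b}(\preccurlyeq) = {\ge}$. When $x_0, x_1 \in [0, \infty)$, observe that

$$(x_0, x_1) \in \mathfrak{a}(\preccurlyeq) \text{ if and only if } (f(x_0), f(x_1)) \in \mathfrak{b}(\preccurlyeq)$$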


Section 7.3 HOMOMORPHISMS 375 


because
$$x_0 \le x_1 \text{ if and only if } -x_0 \ge -x_1.$$

Therefore, we know that $\preccurlyeq$ behaves in $\mathfrak{B}$ as it behaves in $\mathfrak{A}$ because of $f$. We generalize this notion to all $S$-structures in the next definition.


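■ DEFINITION 7.3.1

Let $\mathfrak{A} = (A, \mathfrak{a})$ and $\mathfrak{B} = (B, \mathfrak{b})$ be $S$-structures. A function $\varphi : A \to B$ is a homomorphism $\mathfrak{A} \to \mathfrak{B}$ if it preserves the structure of $\mathfrak{A}$ in $\mathfrak{B}$. This means that $\varphi$ satisfies the following conditions.

• $\varphi(\mathfrak{a}(c)) = \mathfrak{b}(c)$ for all constant symbols $c \in S$.
• If $R$ is an $n$-ary relation symbol in $S$, then for all $a_0, a_1, \ldots, a_{n-1} \in A$,
$$(a_0, a_1, \ldots, a_{n-1}) \in \mathfrak{a}(R)$$
if and only if
$$(\varphi(a_0), \varphi(a_1), \ldots, \varphi(a_{n-1})) \in \mathfrak{b}(R).$$
• If $f$ is an $n$-ary function symbol in $S$, for all $a_0, a_1, \ldots, a_{n-1} \in A$,
$$\varphi(\mathfrak{a}(f)(a_0, a_1, \ldots, a_{n-1})) = \mathfrak{b}(f)(\varphi(a_0), \varphi(a_1), \ldots, \varphi(a_{n-1})).$$

When $\varphi : A \to B$ is a homomorphism, write $\varphi : \mathfrak{A} \to \mathfrak{B}$.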


There are many important examples of homomorphisms that can be found in algebra. For instance, let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings and $\varphi : \mathfrak{R} \to \mathfrak{R}'$ be a homomorphism. By Definition 7.3.1, this means that

$$\varphi(0) = 0'.$$

Because $\oplus$ and $\otimes$ are the only function symbols in RI, for all $a, b \in R$,
$$\varphi(a + b) = \varphi(a) +' \varphi(b)$$

and
$$\varphi(a \cdot b) = \varphi(a) \cdot' \varphi(b).$$


This together with a similar analysis of homomorphisms between groups motivates the 
next definition. 


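■ DEFINITION 7.3.2

• A group homomorphism is a homomorphism $\mathfrak{G} \to \mathfrak{G}'$, where $\mathfrak{G}$ and $\mathfrak{G}'$ are groups.
• A ring homomorphism is a homomorphism $\mathfrak{R} \to \mathfrak{R}'$, where $\mathfrak{R}$ and $\mathfrak{R}'$ are rings.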


Throughout this section the focus will be on ring homomorphisms. 




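■ EXAMPLE 7.3.3

Let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings. The function $\psi : R \to R'$ so that $\psi(r) = 0'$ for all $r \in R$ is a ring homomorphism. This function is called a zero map.

■ EXAMPLE 7.3.4

Define the function $\varphi : \mathbb{Z} \to \mathbb{Z} \times \mathbb{Z}$ by $\varphi(n) = (n, 0)$. Assume that $+$ and $\cdot$ are the standard operations on $\mathbb{Z}$ while $+'$ and $\cdot'$ are the coordinatewise operations on $\mathbb{Z} \times \mathbb{Z}$. Let $m, n \in \mathbb{Z}$. Then,

• $\varphi(0) = (0, 0)$
• $\varphi(m + n) = (m + n, 0) = (m, 0) +' (n, 0) = \varphi(m) +' \varphi(n)$
• $\varphi(mn) = (mn, 0) = (m, 0) \cdot' (n, 0) = \varphi(m) \cdot' \varphi(n)$.

Therefore, $\varphi$ is a ring homomorphism $(\mathbb{Z}, 0, +, \cdot) \to (\mathbb{Z} \times \mathbb{Z}, (0, 0), +', \cdot')$.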


The homomorphism of Example 7.3.4 provides a good opportunity to clarify what a ring homomorphism does. Take the integers 1 and 4. Adding them together in $\mathbb{Z}$ yields 5. The images of these integers under $\varphi$ are $\varphi(1) = (1, 0)$, $\varphi(4) = (4, 0)$, and $\varphi(5) = (5, 0)$. Observe that $(1, 0) + (4, 0) = (5, 0)$ in $\mathbb{Z} \times \mathbb{Z}$, illustrating that the ring homomorphism $\varphi$ preserves the addition structure of $\mathbb{Z}$ in $\mathbb{Z} \times \mathbb{Z}$. That is, with respect to addition, both sets behave the same way. Multiplication also has the property. For example, when 3 is multiplied with 6 the result is 18 in $\mathbb{Z}$, and their images yield $\varphi(3)\varphi(6) = \varphi(18)$ in $\mathbb{Z} \times \mathbb{Z}$. This is illustrated in Figure 7.2.

Although ring homomorphisms will always preserve the additive identity by definition, this is not a condition that needs to be checked.

Figure 7.2. The ring homomorphism $\varphi : \mathbb{Z} \to \mathbb{Z} \times \mathbb{Z}$ defined by $\varphi(n) = (n, 0)$.




■ THEOREM 7.3.5

Let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings. Then, $\varphi : R \to R'$ is a ring homomorphism if and only if for all $x, y \in R$,

• $\varphi(x + y) = \varphi(x) +' \varphi(y)$, and
• $\varphi(x \cdot y) = \varphi(x) \cdot' \varphi(y)$.

PROOF
Since necessity is clear, assume that $\varphi$ preserves addition and multiplication. Then,

$$\varphi(0) +' \varphi(0) = \varphi(0 + 0) = \varphi(0).$$
Adding $-\varphi(0)$ to both sides yields $\varphi(0) = 0'$. ∎
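For maps between small finite rings, the two conditions of Theorem 7.3.5 can be tested exhaustively. The sketch below assumes $\mathbb{Z}_m$ and $\mathbb{Z}_n$ are modeled by `range(m)` and `range(n)` with modular arithmetic; `is_ring_hom` and the sample maps are illustrative choices, not taken from the text.

```python
def is_ring_hom(phi, m, n):
    """Check that phi : Z_m -> Z_n preserves addition and multiplication."""
    return all(
        phi((x + y) % m) == (phi(x) + phi(y)) % n
        and phi((x * y) % m) == (phi(x) * phi(y)) % n
        for x in range(m) for y in range(m)
    )


reduce_mod_4 = lambda x: x % 4      # Z_12 -> Z_4
reduce_mod_5 = lambda x: x % 5      # Z_12 -> Z_5, not compatible with mod 12
zero_map = lambda x: 0              # the zero map of Example 7.3.3

print(is_ring_hom(reduce_mod_4, 12, 4))
print(is_ring_hom(reduce_mod_5, 12, 5))
print(is_ring_hom(zero_map, 12, 4))
```

As the theorem states, no separate check of the additive identity is needed: any map passing the two tests sends `0` to `0`.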


In addition to preserving operations and the additive identity, ring homomorphisms also preserve inverses. Remember, if $\varphi : R \to R'$ is a function and $a \in R$, then $-a$ is the additive inverse of $a$ in $\mathfrak{R}$ and $-\varphi(a)$ is the additive inverse of $\varphi(a)$ inside of $\mathfrak{R}'$.

■ THEOREM 7.3.6

Let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings. If $\varphi : \mathfrak{R} \to \mathfrak{R}'$ is a ring homomorphism, for all $a \in R$,
$$\varphi(-a) = -\varphi(a).$$

PROOF
If $\varphi : \mathfrak{R} \to \mathfrak{R}'$ is a ring homomorphism, then for all $x \in R$,

$$\varphi(x) +' \varphi(-x) = \varphi(x + -x) = \varphi(0) = 0'.$$

■ EXAMPLE 7.3.7

Check Theorem 7.3.6 using the function $\varphi$ as defined in Example 7.3.4. Since $5$ and $-5$ are additive inverses in $\mathbb{Z}$, we have that

$$(5, 0) + (-5, 0) = (0, 0),$$
so $\varphi(5)$ and $\varphi(-5)$ are additive inverses in $\mathbb{Z} \times \mathbb{Z}$.

Ring homomorphisms also preserve subrings and ideals.


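■ THEOREM 7.3.8

Let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings and $\varphi : \mathfrak{R} \to \mathfrak{R}'$ be a ring homomorphism. Let $\mathfrak{I}$ be an ideal of $\mathfrak{R}$ with domain $I$ and $\mathfrak{I}'$ be an ideal of $\mathfrak{R}'$ with domain $I'$.

• If $\varphi$ is onto, $(\varphi[I], 0', +' \upharpoonright \varphi[I], \cdot' \upharpoonright \varphi[I])$ is an ideal of $\mathfrak{R}'$.
• $\mathfrak{J} = (\varphi^{-1}[I'], 0, + \upharpoonright \varphi^{-1}[I'], \cdot \upharpoonright \varphi^{-1}[I'])$ is an ideal of $\mathfrak{R}$.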




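PROOF
The first part is left to Exercise 6. To prove the second, first prove that $\mathfrak{J}$ is a subring of $\mathfrak{R}$.

• To show closure, let $x_1, x_2 \in \varphi^{-1}[I']$. This means that $\varphi(x_1)$ and $\varphi(x_2)$ are elements of $I'$. Hence,
$$\varphi(x_1 + x_2) = \varphi(x_1) +' \varphi(x_2) \in I'$$
and
$$\varphi(x_1 \cdot x_2) = \varphi(x_1) \cdot' \varphi(x_2) \in I'.$$
Therefore, $x_1 + x_2, x_1 \cdot x_2 \in \varphi^{-1}[I']$.
• Since $0' \in I'$ and $\varphi(0) = 0'$, it follows that $0 \in \varphi^{-1}[I']$.
• Let $x \in \varphi^{-1}[I']$. Then, $\varphi(x) \in I'$, and we find that
$$\varphi(-x) = -\varphi(x) \in I'.$$
Thus, $-x \in \varphi^{-1}[I']$.

To see that $\mathfrak{J}$ is an ideal of $\mathfrak{R}$, take $r \in R$ and $a \in \varphi^{-1}[I']$. This means that $\varphi(a) \in I'$. Then,

$$\varphi(ra) = \varphi(r)\varphi(a) \in I'$$

since $\mathfrak{I}'$ is an ideal. Thus, $ra \in \varphi^{-1}[I']$. Similarly, $ar \in \varphi^{-1}[I']$.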


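■ EXAMPLE 7.3.9

The function $\varphi : \mathbb{Z} \to \mathbb{Z}/6\mathbb{Z}$ defined by $\varphi(n) = n + 6\mathbb{Z}$ is an onto homomorphism. (See Exercise 7.) The image of $2\mathbb{Z}$ under this map is

$$\varphi[2\mathbb{Z}] = \{n + 6\mathbb{Z} : n \in 2\mathbb{Z}\} = \{0 + 6\mathbb{Z}, 2 + 6\mathbb{Z}, 4 + 6\mathbb{Z}\}.$$
The pre-image of $I = \{0 + 6\mathbb{Z}, 3 + 6\mathbb{Z}\}$ is
$$\varphi^{-1}[I] = \{n \in \mathbb{Z} : \exists k \in \mathbb{Z}(n = 6k \vee n = 3 + 6k)\} = 3\mathbb{Z}.$$
Notice that both $\varphi[2\mathbb{Z}]$ and $\varphi^{-1}[I]$ are domains of ideals of their respective rings.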


Take a ring $\mathfrak{R}$ and let $R$ be its domain such that $|R| \ge 2$. Let $0$ be the additive identity of $\mathfrak{R}$. We specify two functions. First, define $\pi : R \times R \times R \to R \times R$ by $\pi(x, y, z) = (x, y)$. This is a function similar to that found in Example 4.5.16. Notice that $\operatorname{ran}(\pi) = R \times R$, which means that $\pi$ is onto, and

$$A = \{(x, y, z) : \pi(x, y, z) = (0, 0)\} = \{(0, 0, z) : z \in R\}.$$

Because $|A| > 1$, we conclude that $\pi$ is not one-to-one. Second, define the function $\psi : R \times R \to R \times R \times R$ so that $\psi(x, y) = (x, y, 0)$. Compare $\psi$ with Example 4.5.7.




This function is not onto because $\operatorname{ran}(\psi) = R \times R \times \{0\}$, but it does appear to be one-to-one because

$$B = \{(x, y) : \psi(x, y) = (0, 0, 0)\} = \{(0, 0)\}.$$


The sets A and B that contain the elements of the domain that are mapped to the identity 
of the range appear important to determining whether a function is one-to-one. For this 
reason, such sets are named. 


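■ DEFINITION 7.3.10

Let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings. Let $\varphi : R \to R'$ be a function. The kernel of $\varphi$ is

$$\ker(\varphi) = \{x \in R : \varphi(x) = 0'\}.$$

Notice that Definition 7.3.10 implies that $\ker(\varphi) = \varphi^{-1}[\{0'\}]$. Therefore, when $\varphi$ is a ring homomorphism, $\ker(\varphi)$ is the domain of an ideal of $\mathfrak{R}$ (Theorem 7.3.8). Similarly, because $\operatorname{ran}(\varphi) = \varphi[R]$, when $\varphi$ is onto we conclude that $\operatorname{ran}(\varphi)$ is the domain of an ideal of $\mathfrak{R}'$.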


■ EXAMPLE 7.3.11
Let $\varphi : \mathbb{Z} \to \mathbb{Z}_5$ be defined by $\varphi(n) = [n]_5$.

• To find the kernel, assume $\varphi(n) = [0]_5$. By the assumption, $[n]_5 = [0]_5$. So, $n \in [0]_5$, which means $5 \mid n$. Hence,
$$\ker(\varphi) \subseteq 5\mathbb{Z}.$$
Since the steps are reversible, $\ker(\varphi) = 5\mathbb{Z}$.
• Because $\operatorname{ran}(\varphi) = \{[n]_5 : n \in \mathbb{Z}\}$ encompasses all congruence classes modulo 5, $\varphi$ is onto and $\operatorname{ran}(\varphi) = \mathbb{Z}_5$.

We now prove that the kernel does provide a test for whether a ring homomorphism is an injection. Since any ring homomorphism $\varphi : \mathfrak{R} \to \mathfrak{R}'$ maps the additive identity of $\mathfrak{R}$ to the additive identity of $\mathfrak{R}'$, it is always the case that $0 \in \ker(\varphi)$. Therefore, to show that $\ker(\varphi) = \{0\}$, we only need to prove $\ker(\varphi) \subseteq \{0\}$.


■ THEOREM 7.3.12

Let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings such that $0$ is the additive identity of $\mathfrak{R}$. Suppose that the function $\varphi : \mathfrak{R} \to \mathfrak{R}'$ is a ring homomorphism. Then, $\varphi$ is one-to-one if and only if $\ker(\varphi) = \{0\}$.

PROOF
• Suppose that $\varphi$ is one-to-one. Take $x \in \ker(\varphi)$. This means that $\varphi(x) = 0'$, where $0'$ is the additive identity of $\mathfrak{R}'$. Since $\varphi$ is an injection, $x = 0$. This implies that $\ker(\varphi) = \{0\}$.




• Now let $\ker(\varphi) = \{0\}$ and assume $\varphi(x_1) = \varphi(x_2)$, where $x_1, x_2 \in \operatorname{dom}(\mathfrak{R})$. We then have $\varphi(x_1) - \varphi(x_2) = 0'$. Since $\varphi$ is a homomorphism, we have

$$\varphi(x_1 - x_2) = \varphi(x_1) - \varphi(x_2)$$

by Theorem 7.3.6. Hence, $x_1 - x_2 \in \ker(\varphi)$, which means $x_1 - x_2 = 0$ by hypothesis. Therefore, $x_1 = x_2$.
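The kernel test can be tried on two small ring homomorphisms. The sketch below assumes $\mathbb{Z}_m$ is modeled by `range(m)`; the helper names and both sample maps are illustrative. In particular, the map $x \mapsto 9x \bmod 12$ from $\mathbb{Z}_4$ to $\mathbb{Z}_{12}$ is an example chosen here (it preserves both operations because $9 \cdot 9 \equiv 9$ and $4 \cdot 9 \equiv 0 \pmod{12}$), not one from the text.

```python
def kernel(phi, m):
    """Elements of Z_m (modeled as range(m)) that phi sends to 0."""
    return {x for x in range(m) if phi(x) == 0}


def is_injective(phi, m):
    """True when phi takes m distinct values on range(m)."""
    return len({phi(x) for x in range(m)}) == m


reduce_mod_4 = lambda x: x % 4          # Z_12 -> Z_4, nontrivial kernel
embed = lambda x: (9 * x) % 12          # Z_4 -> Z_12, trivial kernel

print(kernel(reduce_mod_4, 12), is_injective(reduce_mod_4, 12))
print(kernel(embed, 4), is_injective(embed, 4))
```

In both cases injectivity holds exactly when the kernel is `{0}`, as Theorem 7.3.12 predicts.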


A similar proof can be used to demonstrate the same result for group homomorphisms.

■ THEOREM 7.3.13

Let $\mathfrak{G}$ and $\mathfrak{G}'$ be groups such that $e$ is the identity of $\mathfrak{G}$. Suppose that the function $\varphi : \mathfrak{G} \to \mathfrak{G}'$ is a group homomorphism. Then, $\varphi$ is one-to-one if and only if $\ker(\varphi) = \{e\}$.


Isomorphisms 


Define $\psi : \mathbb{Z}_5 \times \mathbb{Z}_6 \to \mathbb{Z}_6 \times \mathbb{Z}_5$ by $\psi([a]_5, [b]_6) = ([b]_6, [a]_5)$. Let $+$ be coordinatewise addition on $\mathbb{Z}_5 \times \mathbb{Z}_6$ and $+'$ be coordinatewise addition on $\mathbb{Z}_6 \times \mathbb{Z}_5$. We prove that this function preserves addition and is a bijection.

• Let $a, b, c, d \in \mathbb{Z}$. Then,
$$\begin{aligned}
\psi(([a]_5, [b]_6) + ([c]_5, [d]_6)) &= \psi([a]_5 + [c]_5, [b]_6 + [d]_6) \\
&= \psi([a + c]_5, [b + d]_6) \\
&= ([b + d]_6, [a + c]_5) \\
&= ([b]_6 + [d]_6, [a]_5 + [c]_5) \\
&= ([b]_6, [a]_5) +' ([d]_6, [c]_5) \\
&= \psi([a]_5, [b]_6) +' \psi([c]_5, [d]_6).
\end{aligned}$$
• Let $([a]_5, [b]_6) \in \ker(\psi)$. In other words, $([b]_6, [a]_5) = ([0]_6, [0]_5)$. Hence, $5 \mid a$ and $6 \mid b$, which implies that $[a]_5 = [0]_5$ and $[b]_6 = [0]_6$ (Example 4.2.6). Thus, $\psi$ is an injection by Theorem 7.3.13.
• To see that $\psi$ is onto, take $([c]_6, [d]_5) \in \mathbb{Z}_6 \times \mathbb{Z}_5$. Then,
$$\psi([d]_5, [c]_6) = ([c]_6, [d]_5).$$
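Since both groups have only 30 elements, the three claims about $\psi$ can be confirmed by enumeration. The sketch below assumes congruence classes are represented by their least nonnegative representatives; the names `psi`, `add56`, and `add65` are illustrative.

```python
from itertools import product

Z5xZ6 = list(product(range(5), range(6)))


def psi(p):
    """Swap the coordinates: ([a]_5, [b]_6) -> ([b]_6, [a]_5)."""
    a, b = p
    return (b, a)


def add56(p, q):
    """Coordinatewise addition on Z_5 x Z_6."""
    return ((p[0] + q[0]) % 5, (p[1] + q[1]) % 6)


def add65(p, q):
    """Coordinatewise addition on Z_6 x Z_5."""
    return ((p[0] + q[0]) % 6, (p[1] + q[1]) % 5)


additive = all(psi(add56(p, q)) == add65(psi(p), psi(q))
               for p in Z5xZ6 for q in Z5xZ6)
kernel = [p for p in Z5xZ6 if psi(p) == (0, 0)]
onto = {psi(p) for p in Z5xZ6} == set(product(range(6), range(5)))
print(additive, kernel, onto)
```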


The function $\psi$ generalizes to the next definition. Note the similarity between this definition and that of an order isomorphism (Definition 4.5.24).

■ DEFINITION 7.3.14

Let $\mathfrak{A}$ and $\mathfrak{B}$ be $S$-structures. An isomorphism $\mathfrak{A} \to \mathfrak{B}$ is a homomorphism $\mathfrak{A} \to \mathfrak{B}$ that is a bijection. An isomorphism $\mathfrak{A} \to \mathfrak{A}$ is called an automorphism. If there is an isomorphism $\varphi : \mathfrak{A} \to \mathfrak{B}$, the $S$-structures are isomorphic and we write $\mathfrak{A} \cong \mathfrak{B}$.




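If $\mathfrak{A} = (\mathbb{Z}_5 \times \mathbb{Z}_6, ([0]_5, [0]_6), +)$ and $\mathfrak{B} = (\mathbb{Z}_6 \times \mathbb{Z}_5, ([0]_6, [0]_5), +')$, then $\psi$ shows that $\mathfrak{A} \cong \mathfrak{B}$.

Like order isomorphisms (Example 4.5.28), there are three basic isomorphism results. The proof of this is left to Exercise 8.

■ THEOREM 7.3.15

Let $\mathfrak{A}$, $\mathfrak{B}$, and $\mathfrak{C}$ be $S$-structures.
• $\mathfrak{A} \cong \mathfrak{A}$.
• If $\mathfrak{A} \cong \mathfrak{B}$, then $\mathfrak{B} \cong \mathfrak{A}$.
• If $\mathfrak{A} \cong \mathfrak{B}$ and $\mathfrak{B} \cong \mathfrak{C}$, then $\mathfrak{A} \cong \mathfrak{C}$.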


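■ EXAMPLE 7.3.16
Let $S = \{0, <, \cdot\}$. Let $\mathfrak{A} = ({}^2\omega, \mathfrak{a})$, where
$$\mathfrak{a}(0)(0) = 0 \text{ and } \mathfrak{a}(0)(1) = 0,$$
and for all $f, g \in {}^2\omega$,
$$(f, g) \in \mathfrak{a}(<) \text{ if and only if } f(0) < g(0) \vee [f(0) = g(0) \wedge f(1) < g(1)],$$
that is, $<$ is interpreted as a lexicographical order (compare Exercise 4.3.16), and
$$\mathfrak{a}(\cdot) \text{ is function multiplication.}$$
This means that $\mathfrak{a}(\cdot)(f, g)(x) = f(x)g(x)$. Also, let $\mathfrak{B} = (\omega \times \omega \times \{0\}, \mathfrak{b})$, where
$$\mathfrak{b}(0) = (0, 0, 0),$$
and for all $m, n, m', n' \in \omega$,
$$((m, n, 0), (m', n', 0)) \in \mathfrak{b}(<) \text{ if and only if } n < n' \vee [n = n' \wedge m < m'],$$
and
$$\mathfrak{b}(\cdot) \text{ is coordinate-wise multiplication.}$$
That is,
$$\mathfrak{b}(\cdot)((m, n, 0), (m', n', 0)) = (mm', nn', 0).$$

Show that $\mathfrak{A} \cong \mathfrak{B}$ by showing that $\varphi : \mathfrak{A} \to \mathfrak{B}$ defined by
$$\varphi(\psi) = (\psi(1), \psi(0), 0)$$
is an $S$-isomorphism.

• $\varphi(\mathfrak{a}(0)) = (\mathfrak{a}(0)(1), \mathfrak{a}(0)(0), 0) = (0, 0, 0) = \mathfrak{b}(0)$.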




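• Let $f, g \in {}^2\omega$. Then,
$$\begin{aligned}
(f, g) \in \mathfrak{a}(<) &\Leftrightarrow f(0) < g(0) \vee [f(0) = g(0) \wedge f(1) < g(1)] \\
&\Leftrightarrow ((f(1), f(0), 0), (g(1), g(0), 0)) \in \mathfrak{b}(<) \\
&\Leftrightarrow (\varphi(f), \varphi(g)) \in \mathfrak{b}(<).
\end{aligned}$$
• Let $f, g \in {}^2\omega$. Then,
$$\begin{aligned}
\varphi(\mathfrak{a}(\cdot)(f, g)) &= \varphi(\{(0, f(0)g(0)), (1, f(1)g(1))\}) \\
&= (f(1)g(1), f(0)g(0), 0) \\
&= \mathfrak{b}(\cdot)((f(1), f(0), 0), (g(1), g(0), 0)) \\
&= \mathfrak{b}(\cdot)(\varphi(f), \varphi(g)).
\end{aligned}$$

Therefore, $\varphi$ is a homomorphism. That $\varphi$ is a bijection is Exercise 9. Hence, $\mathfrak{A} \cong \mathfrak{B}$, and by Theorem 7.3.15, we have that $\mathfrak{B} \cong \mathfrak{A}$.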


Because there is a bijection between $\mathbb{Z}_5 \times \mathbb{Z}_6$ and $\mathbb{Z}_6 \times \mathbb{Z}_5$, they have the same cardinality. Since the bijection that is typically chosen is also a homomorphism, the algebraic structure of the two rings is the same. Putting this together, we conclude that the two rings "look" the same as rings. The only difference is in the labeling of their elements. To formalize this notion, we make the next definition.


HM DEFINITION 7.3.17 


¢ A group homomorphism that is a bijection is a group isomorphism. 


e A ring homomorphism that is a bijection is a ring isomorphism. 


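We say that two rings $\mathfrak{R}$ and $\mathfrak{R}'$ are isomorphic if there is an isomorphism $\psi : \mathfrak{R} \to \mathfrak{R}'$. If two rings are isomorphic, write $\mathfrak{R} \cong \mathfrak{R}'$. For example, we saw that

$$\mathbb{Z}_5 \times \mathbb{Z}_6 \cong \mathbb{Z}_6 \times \mathbb{Z}_5.$$
The next example illustrates what we mean by "looking" the same.

■ EXAMPLE 7.3.18
Suppose that $\varphi : \mathfrak{R} \to \mathfrak{R}'$ is a ring isomorphism.
• If $\mathfrak{R}$ is an integral domain, $\mathfrak{R}'$ is an integral domain.
• If $\mathfrak{R}$ is a division ring, $\mathfrak{R}'$ is a division ring.
• If $\mathfrak{R}$ is a field, $\mathfrak{R}'$ is a field.

The proofs of the first two are left to Exercise 13. For example, suppose that $\mathfrak{R}$ is a field. We show that $\mathfrak{R}'$ is also a field.

• Let $1$ be unity from $\mathfrak{R}$. Then, $\varphi(1)$ is unity from $\mathfrak{R}'$ (Exercise 14).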




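• Let $y_0, y_1 \in R'$. Since $\varphi$ is onto, there exist $x_0, x_1 \in R$ so that $\varphi(x_0) = y_0$ and $\varphi(x_1) = y_1$. Hence,
$$\begin{aligned}
y_0 \cdot' y_1 &= \varphi(x_0) \cdot' \varphi(x_1) \\
&= \varphi(x_0 \cdot x_1) \\
&= \varphi(x_1 \cdot x_0) \\
&= \varphi(x_1) \cdot' \varphi(x_0) \\
&= y_1 \cdot' y_0.
\end{aligned}$$
• Let $y_0 \in R' \setminus \{0'\}$. There exists $x_0 \in R$ such that $\varphi(x_0) = y_0$. Since $\varphi$ is one-to-one, $x_0 \ne 0$. Thus, because $\mathfrak{R}$ is a field, there exists $x_1 \in R \setminus \{0\}$ such that $x_0 \cdot x_1 = 1$, so
$$y_0 \cdot' \varphi(x_1) = \varphi(x_0) \cdot' \varphi(x_1) = \varphi(x_0 \cdot x_1) = \varphi(1).$$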


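The result of the next example is known as the fundamental homomorphism theorem.

■ EXAMPLE 7.3.19

Let $\mathfrak{R}$ and $\mathfrak{R}'$ be rings. Let $\varphi : \mathfrak{R} \to \mathfrak{R}'$ be a surjective ring homomorphism. Define $\psi : R/\ker(\varphi) \to R'$ by

$$\psi(a + \ker(\varphi)) = \varphi(a).$$

To prove that $\psi$ is well-defined, let $a, b \in R$ such that $a + \ker(\varphi) = b + \ker(\varphi)$. This implies that $a - b \in \ker(\varphi)$, so

$$0' = \varphi(a - b) = \varphi(a) -' \varphi(b).$$
That is, $\varphi(a) = \varphi(b)$. Now to show that $\psi$ is a ring homomorphism.

• Take $a, b \in R$. Then,
$$\begin{aligned}
\psi(a + b + \ker(\varphi)) &= \varphi(a + b) \\
&= \varphi(a) +' \varphi(b) \\
&= \psi(a + \ker(\varphi)) +' \psi(b + \ker(\varphi)),
\end{aligned}$$
and
$$\begin{aligned}
\psi(a \cdot b + \ker(\varphi)) &= \varphi(a \cdot b) \\
&= \varphi(a) \cdot' \varphi(b) \\
&= \psi(a + \ker(\varphi)) \cdot' \psi(b + \ker(\varphi)).
\end{aligned}$$
• To prove that $\psi$ is onto, take $b \in R'$. Since $\varphi$ is onto, there exists $a \in R$ such that $\varphi(a) = b$. Therefore,
$$\psi(a + \ker(\varphi)) = \varphi(a) = b.$$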




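• To prove that $\psi$ is one-to-one, let $a, b \in R$ and assume that
$\psi(a + \ker(\varphi)) = \psi(b + \ker(\varphi))$, that is, $\varphi(a) = \varphi(b)$.
Since $\varphi$ is a ring homomorphism,
\[
\varphi(a - b) = \varphi(a) -' \varphi(b) = 0.
\]
Hence, $a - b \in \ker(\varphi)$, which implies that $a + \ker(\varphi) = b + \ker(\varphi)$.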
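
As an informal check of Example 7.3.19, the following sketch works through one small case in Python. The choice of rings (the map $\mathbb{Z}_{12} \to \mathbb{Z}_4$ given by reduction modulo 4) and all function names are assumptions made only for illustration. The code confirms by brute force that the induced map on cosets is well-defined, is a bijection, and preserves both operations.

```python
from itertools import product

# Assumed example: phi reduces Z_12 onto Z_4 (a surjective ring homomorphism).
N, M = 12, 4
ring = range(N)

def phi(a):
    return a % M

# The kernel of phi is the set of elements sent to 0.
kernel = frozenset(a for a in ring if phi(a) == 0)

def coset(a):
    """The coset a + ker(phi), computed inside Z_12."""
    return frozenset((a + k) % N for k in kernel)

cosets = {coset(a) for a in ring}

def psi(c):
    """Induced map on cosets; the assertion checks that it is well-defined."""
    values = {phi(a) for a in c}
    assert len(values) == 1
    return values.pop()

# psi hits every element of Z_4 exactly once.
psi_is_bijection = sorted(psi(c) for c in cosets) == list(range(M))

# psi preserves addition and multiplication of cosets.
psi_preserves_ops = all(
    psi(coset(a + b)) == (phi(a) + phi(b)) % M
    and psi(coset(a * b)) == (phi(a) * phi(b)) % M
    for a, b in product(ring, repeat=2)
)
```

A finite check of this kind is no substitute for the proof, but it makes the roles of the kernel and the cosets concrete.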


Elementary Equivalence 


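Let A be a first-order language with theory symbols $S$. Suppose that $\mathfrak{A} = (A, \mathfrak{a})$ and
$\mathfrak{B} = (B, \mathfrak{b})$ are $S$-structures and $I$ is an interpretation of $\mathfrak{A}$. Let $\varphi : \mathfrak{A} \to \mathfrak{B}$ be an
isomorphism. Define
\[
I_\varphi : \mathrm{TERMS}(\mathsf{A}) \to B
\]
by $I_\varphi(t) = (\varphi \circ I)(t)$ for all $t \in \mathrm{TERMS}(\mathsf{A})$. We confirm that $I_\varphi$ is an interpretation of
$\mathfrak{B}$.

• If $x$ is a variable symbol, $I_\varphi(x) = (\varphi \circ I)(x) \in B$.
• Let $c$ be a constant symbol. Then, $I_\varphi(c) = (\varphi \circ I)(c) = \varphi(\mathfrak{a}(c)) = \mathfrak{b}(c)$.
• Let $f$ be an $n$-ary function symbol and $t_0, t_1, \dots, t_{n-1} \in \mathrm{TERMS}(\mathsf{A})$. Then,
\[
\begin{aligned}
I_\varphi(f(t_0, t_1, \dots, t_{n-1})) &= (\varphi \circ I)(f(t_0, t_1, \dots, t_{n-1})) \\
&= \varphi(\mathfrak{a}(f)(I(t_0), I(t_1), \dots, I(t_{n-1}))) \\
&= \mathfrak{b}(f)(\varphi(I(t_0)), \varphi(I(t_1)), \dots, \varphi(I(t_{n-1}))) \\
&= \mathfrak{b}(f)(I_\varphi(t_0), I_\varphi(t_1), \dots, I_\varphi(t_{n-1})).
\end{aligned}
\]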


An example will clarify the use of the interpretation J,. 


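EXAMPLE 7.3.20
Let $\mathfrak{A} = (\mathbb{Z}, 0, +)$ and $\mathfrak{B} = (2\mathbb{Z}, 0, + \upharpoonright 2\mathbb{Z})$. Assume that $I$ is an interpretation
of $\mathfrak{A}$. Define the group isomorphism $f : \mathbb{Z} \to 2\mathbb{Z}$ by $f(n) = 2n$. We consider
the GR-formula $y \circ x = e$.
\[
\begin{aligned}
\mathfrak{A} \vDash \exists x(y \circ x = e)\ [I]
&\Leftrightarrow \mathfrak{A} \vDash (y \circ x = e)\ [I_x^a] \text{ for some } a \in \mathbb{Z} \\
&\Leftrightarrow I_x^a(y \circ x) = I_x^a(e) \text{ for some } a \in \mathbb{Z} \\
&\Leftrightarrow f(I_x^a(y \circ x)) = f(I_x^a(e)) \text{ for some } a \in \mathbb{Z} \\
&\Leftrightarrow f(I_x^a(y)) + f(I_x^a(x)) = f(I_x^a(e)) \text{ for some } a \in \mathbb{Z} \\
&\Leftrightarrow f(I(y)) + f(a) = f(I(e)) \text{ for some } a \in \mathbb{Z} \\
&\Leftrightarrow f(I(y)) + b = f(I(e)) \text{ for some } b \in 2\mathbb{Z} \\
&\Leftrightarrow (I_f)_x^b(y) + (I_f)_x^b(x) = (I_f)_x^b(e) \text{ for some } b \in 2\mathbb{Z} \\
&\Leftrightarrow (I_f)_x^b(y \circ x) = (I_f)_x^b(e) \text{ for some } b \in 2\mathbb{Z} \\
&\Leftrightarrow \mathfrak{B} \vDash y \circ x = e\ [(I_f)_x^b] \text{ for some } b \in 2\mathbb{Z} \\
&\Leftrightarrow \mathfrak{B} \vDash \exists x(y \circ x = e)\ [I_f].
\end{aligned}
\]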




Using $f$, we see that the interpretation of $\mathfrak{A}$ gives rise to an interpretation of $\mathfrak{B}$.
For example, notice that if $I(y) = 3$, then $f(I(y)) = 6$. Hence, in $\mathfrak{A}$,

$\exists x(y \circ x = e)$ is interpreted as $3 + x = 0$ for some $x \in A$,

and the witness of $\exists x(y \circ x = e)$ is $-3$. In $\mathfrak{B}$,

$\exists x(y \circ x = e)$ is interpreted as $6 + x = 0$ for some $x \in B$

and the witness of $\exists x(y \circ x = e)$ is $-6$.
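
The transport of an interpretation along $f$ can be traced numerically. The sketch below is only an illustration: the dictionaries `I` and `I_f`, the helper `witness`, and the finite search window are assumptions introduced here and are not part of the text. It recovers the witnesses $-3$ and $-6$ and checks that $f$ carries the first to the second.

```python
def f(n):
    """The isomorphism Z -> 2Z of Example 7.3.20."""
    return 2 * n

# Assumed interpretation of the symbols y and e in (Z, 0, +).
I = {"y": 3, "e": 0}

# The induced interpretation in (2Z, 0, +) is f composed with I.
I_f = {symbol: f(value) for symbol, value in I.items()}

def witness(interp, candidates):
    """First candidate x with interp(y) + x = interp(e)."""
    return next(x for x in candidates if interp["y"] + x == interp["e"])

# A finite search window stands in for the infinite domains.
window = range(-10, 11)
w_A = witness(I, window)
w_B = witness(I_f, (x for x in window if x % 2 == 0))
```

Here `f(w_A) == w_B`, which is the numerical content of the equivalence chain in the example.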


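LEMMA 7.3.21


Let $\mathfrak{A} = (A, \mathfrak{a})$ and $\mathfrak{B} = (B, \mathfrak{b})$ be $S$-structures. Assume that $\varphi : \mathfrak{A} \to \mathfrak{B}$ is an
isomorphism. Let $I$ be an $S$-interpretation of $\mathfrak{A}$. Then, $(I_x^a)_\varphi = (I_\varphi)_x^{\varphi(a)}$ for all
$a \in A$.


PROOF
We proceed by induction on terms, relying on Definitions 7.1.4 and 7.3.1.

• Let $x$ and $y$ be variable symbols. Then,
\[
(I_x^a)_\varphi(x) = \varphi(I_x^a(x)) = \varphi(a) = (I_\varphi)_x^{\varphi(a)}(x),
\]
and if $y \neq x$,
\[
(I_x^a)_\varphi(y) = \varphi(I_x^a(y)) = \varphi(I(y)) = (I_\varphi)_x^{\varphi(a)}(y).
\]

• If $c$ is a constant symbol, then
\[
(I_x^a)_\varphi(c) = \varphi(I_x^a(c)) = \varphi(I(c)) = (I_\varphi)_x^{\varphi(a)}(c).
\]

• Let $f$ be an $n$-ary function symbol and $t_0, t_1, \dots, t_{n-1}$ be $S$-terms. Take
$a \in A$. Then,
\[
\begin{aligned}
(I_x^a)_\varphi(f(t_0, t_1, \dots, t_{n-1}))
&= \varphi(I_x^a(f(t_0, t_1, \dots, t_{n-1}))) \\
&= \varphi(\mathfrak{a}(f)(I_x^a(t_0), I_x^a(t_1), \dots, I_x^a(t_{n-1}))) \\
&= \mathfrak{b}(f)(\varphi(I_x^a(t_0)), \varphi(I_x^a(t_1)), \dots, \varphi(I_x^a(t_{n-1}))) \\
&= \mathfrak{b}(f)((I_x^a)_\varphi(t_0), (I_x^a)_\varphi(t_1), \dots, (I_x^a)_\varphi(t_{n-1})) \\
&= \mathfrak{b}(f)((I_\varphi)_x^{\varphi(a)}(t_0), (I_\varphi)_x^{\varphi(a)}(t_1), \dots, (I_\varphi)_x^{\varphi(a)}(t_{n-1})) \\
&= (I_\varphi)_x^{\varphi(a)}(f(t_0, t_1, \dots, t_{n-1})).
\end{aligned}
\]

The fifth equality follows by induction. ∎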


If $\varphi : \mathfrak{A} \to \mathfrak{B}$ is an isomorphism, the structures look the same. Thus, formulas
should be interpreted in $\mathfrak{B}$ essentially in the same way that they are interpreted in $\mathfrak{A}$.
In other words, given an interpretation $I$ of $\mathfrak{A}$, there should be an interpretation of $\mathfrak{B}$
that is like $I$. This interpretation is $I_\varphi$.




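LEMMA 7.3.22


Let $\mathfrak{A}$ and $\mathfrak{B}$ be $S$-structures with isomorphism $\varphi : \mathfrak{A} \to \mathfrak{B}$. Assume that $I$ is
an $S$-interpretation of $\mathfrak{A}$. Then, for every $S$-formula $p$, $\mathfrak{A} \vDash p\ [I]$ if and only if
$\mathfrak{B} \vDash p\ [I_\varphi]$.


PROOF
We proceed by induction on formulas, relying on Theorem 7.1.7.

• Suppose that $t_0$ and $t_1$ are $S$-terms. Since $\varphi$ is one-to-one,
\[
\begin{aligned}
\mathfrak{A} \vDash t_0 = t_1\ [I] &\Leftrightarrow I(t_0) = I(t_1) \\
&\Leftrightarrow \varphi(I(t_0)) = \varphi(I(t_1)) \\
&\Leftrightarrow I_\varphi(t_0) = I_\varphi(t_1) \\
&\Leftrightarrow \mathfrak{B} \vDash t_0 = t_1\ [I_\varphi].
\end{aligned}
\]

• Let $R$ be an $n$-ary relation symbol and $t_0, t_1, \dots, t_{n-1}$ be $S$-terms. Since $\varphi$
is an isomorphism,
\[
\begin{aligned}
\mathfrak{A} \vDash R(t_0, t_1, \dots, t_{n-1})\ [I]
&\Leftrightarrow (I(t_0), I(t_1), \dots, I(t_{n-1})) \in \mathfrak{a}(R) \\
&\Leftrightarrow (\varphi(I(t_0)), \varphi(I(t_1)), \dots, \varphi(I(t_{n-1}))) \in \mathfrak{b}(R) \\
&\Leftrightarrow (I_\varphi(t_0), I_\varphi(t_1), \dots, I_\varphi(t_{n-1})) \in \mathfrak{b}(R) \\
&\Leftrightarrow \mathfrak{B} \vDash R(t_0, t_1, \dots, t_{n-1})\ [I_\varphi].
\end{aligned}
\]

• Assume that $q$ is an $S$-formula. Then,
\[
\mathfrak{A} \vDash \neg q\ [I] \Leftrightarrow \mathfrak{A} \nvDash q\ [I] \Leftrightarrow \mathfrak{B} \nvDash q\ [I_\varphi] \Leftrightarrow \mathfrak{B} \vDash \neg q\ [I_\varphi],
\]
where the middle equivalence holds by induction.

• Let $q$ and $r$ be $S$-formulas. Suppose that $\mathfrak{A} \vDash q \to r\ [I]$. This means that
$\mathfrak{A} \vDash q\ [I]$ implies $\mathfrak{A} \vDash r\ [I]$. Assume $\mathfrak{B} \vDash q\ [I_\varphi]$. By induction, we
have that $\mathfrak{A} \vDash q\ [I]$, whence $\mathfrak{A} \vDash r\ [I]$. Again, by induction, $\mathfrak{B} \vDash r\ [I_\varphi]$.
Therefore, $\mathfrak{B} \vDash q \to r\ [I_\varphi]$. The converse is proved similarly.

• Let $q$ be an $S$-formula. Then, by Lemma 7.3.21 and since $\varphi$ is an isomorphism,
\[
\begin{aligned}
\mathfrak{A} \vDash \exists x q\ [I] &\Leftrightarrow \mathfrak{A} \vDash q\ [I_x^a] \text{ for some } a \in A \\
&\Leftrightarrow \mathfrak{B} \vDash q\ [(I_x^a)_\varphi] \text{ for some } a \in A \\
&\Leftrightarrow \mathfrak{B} \vDash q\ [(I_\varphi)_x^{\varphi(a)}] \text{ for some } a \in A \\
&\Leftrightarrow \mathfrak{B} \vDash q\ [(I_\varphi)_x^b] \text{ for some } b \in B \\
&\Leftrightarrow \mathfrak{B} \vDash \exists x q\ [I_\varphi]. \quad ∎
\end{aligned}
\]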




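Let $\mathfrak{G}_1 = (2\mathbb{Z}, 0, +)$ and $\mathfrak{G}_2 = (3\mathbb{Z}, 0, +)$. These isomorphic groups are infinite
and cyclic, each with exactly two generators. The group $\mathfrak{G}_1$ is generated by 2 and
$-2$, and $\mathfrak{G}_2$ is generated by 3 and $-3$. When Lemma 7.3.22 is restricted to sentences,
we conclude that isomorphic structures model the same sentences. Hence, the GR-sentences
satisfied by $\mathfrak{G}_1$ are exactly those GR-sentences satisfied by $\mathfrak{G}_2$. For example,
\[
\mathfrak{G}_1 \vDash \forall x \forall y (x \circ y = y \circ x)
\]
and
\[
\mathfrak{G}_2 \vDash \forall x \forall y (x \circ y = y \circ x).
\]
Also,
\[
\mathfrak{G}_1 \vDash \exists x \forall y \exists z (y = x \circ z)
\]
and
\[
\mathfrak{G}_2 \vDash \exists x \forall y \exists z (y = x \circ z).
\]
Therefore, we make the next definition and follow it immediately with the theorem that
follows from Lemma 7.3.22 and Theorem 7.1.31.

DEFINITION 7.3.23


The $S$-structures $\mathfrak{A}$ and $\mathfrak{B}$ are elementary equivalent (denoted by $\mathfrak{A} \equiv \mathfrak{B}$) if for
all $S$-sentences $p$, $\mathfrak{A} \vDash p$ if and only if $\mathfrak{B} \vDash p$.


THEOREM 7.3.24

For all $S$-structures $\mathfrak{A}$ and $\mathfrak{B}$, if $\mathfrak{A} \cong \mathfrak{B}$, then $\mathfrak{A} \equiv \mathfrak{B}$.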

Theorem 7.3.24 implies that if structures are isomorphic, there is no first-order sentence
that can be used to distinguish between the two. That is, there is no sentence $p$
such that $\mathfrak{A} \vDash p$ but $\mathfrak{B} \nvDash p$ when $\mathfrak{A} \cong \mathfrak{B}$. Equivalently, if there is a sentence that is
satisfied by one structure but not the other, the structures cannot be isomorphic.


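EXAMPLE 7.3.25


Let $\mathfrak{A}$ and $\mathfrak{B}$ be ST-structures such that $\mathfrak{A} = (\omega^+, \in)$ and $\mathfrak{B} = (\omega, \in)$. Observe
that
\[
\mathfrak{A} \vDash \exists x \forall y (y \in x \lor y = x)
\]
but
\[
\mathfrak{B} \nvDash \exists x \forall y (y \in x \lor y = x).
\]
Therefore, by Theorem 7.3.24, we have that $\mathfrak{A}$ is not isomorphic to $\mathfrak{B}$.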


EXAMPLE 7.3.26


Let $\mathfrak{A} = (A, \mathfrak{a})$ and $\mathfrak{B} = (B, \mathfrak{b})$ be $\{R\}$-structures, where $R$ is a binary relation
symbol. Let $A = \{a_0, a_1\}$ with $a_0 \neq a_1$. Assume that the structures are not
isomorphic yet $\mathfrak{A} \equiv \mathfrak{B}$. Define
\[
p := \exists x (x = x),
\]
\[
q := \forall x \exists y (x \neq y),
\]
and
\[
r := \forall x \forall y \forall z (x \neq y \land x \neq z \to y = z).
\]
Then, $\mathfrak{A} \vDash p$ because $A$ is nonempty, $\mathfrak{A} \vDash q$ because $A$ has at least two elements,
and $\mathfrak{A} \vDash r$ because $A$ has at most two elements. Hence,
\[
\mathfrak{A} \vDash p \land q \land r.
\]
Since $\mathfrak{A} \equiv \mathfrak{B}$, we have that $\mathfrak{B} \vDash p \land q \land r$, so $|B| = 2$. Write $B = \{b_0, b_1\}$ with
$b_0 \neq b_1$. This implies that there are only two bijections $A \to B$. Call them $\varphi_0$
and $\varphi_1$, and write
\[
\varphi_0 = \{(a_0, b_0), (a_1, b_1)\}
\]
and
\[
\varphi_1 = \{(a_0, b_1), (a_1, b_0)\}.
\]
By assumption, neither of these functions is an $\{R\}$-homomorphism, so they fail
to preserve $R$. Suppose that $(a_0, a_1) \in \mathfrak{a}(R)$ is the particular element that causes
the second condition of Definition 7.3.1 to fail. Therefore,
\[
\mathfrak{A} \vDash \exists x \exists y [x \neq y \land R(x, y)].
\]
Since $\varphi_0$ does not preserve $R$, $(b_0, b_1) \notin \mathfrak{b}(R)$, and since $\varphi_1$ does not preserve
$R$, $(b_1, b_0) \notin \mathfrak{b}(R)$. Hence,
\[
\mathfrak{B} \nvDash \exists x \exists y [x \neq y \land R(x, y)],
\]
which is impossible because $\mathfrak{A} \equiv \mathfrak{B}$, so $\varphi_0$ or $\varphi_1$ must be an $\{R\}$-isomorphism.
The argument generalizes for all structures with a domain of cardinality 2 (Exercise 17)
and then to all structures with a finite domain (Exercise 18). Therefore,
the converse of Theorem 7.3.24 holds, provided that the domains of the structures
are finite.
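
The finite case can be explored by brute force. The following Python sketch enumerates all sixteen binary relations on a two-element domain and evaluates five sentences in each. The particular sentences, and every name in the code, are choices made here for illustration and do not come from the text. The check confirms that, on this domain, two structures agree on these five sentences exactly when they are isomorphic.

```python
from itertools import permutations, product

DOM = (0, 1)
PAIRS = list(product(DOM, repeat=2))

def all_relations():
    """Every binary relation on DOM, as a frozenset of ordered pairs."""
    for bits in product([False, True], repeat=len(PAIRS)):
        yield frozenset(p for p, keep in zip(PAIRS, bits) if keep)

def isomorphic(r, s):
    """True if some bijection of DOM carries r exactly onto s."""
    bijections = (dict(zip(DOM, perm)) for perm in permutations(DOM))
    return any({(g[x], g[y]) for (x, y) in r} == set(s) for g in bijections)

def truth_vector(r):
    """Truth values of five {R}-sentences in the structure (DOM, r)."""
    R = lambda x, y: (x, y) in r
    return (
        any(R(x, x) for x in DOM),                       # some x with R(x, x)
        all(R(x, x) for x in DOM),                       # every x has R(x, x)
        any(x != y and R(x, y) for x in DOM for y in DOM),
        all(R(x, y) for x in DOM for y in DOM if x != y),
        any(x != y and R(x, x) and R(x, y) for x in DOM for y in DOM),
    )

relations = list(all_relations())

# Same truth values on the five sentences if and only if isomorphic.
separates = all(
    isomorphic(r, s) == (truth_vector(r) == truth_vector(s))
    for r in relations
    for s in relations
)
```

Since only finitely many sentences are tested, this is weaker than elementary equivalence, but for a domain of size 2 these sentences already separate the ten isomorphism classes.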


Elementary Substructures 


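Let $\mathfrak{A} = (A, \mathfrak{a})$ and $\mathfrak{B} = (B, \mathfrak{b})$ be $S$-structures such that $\mathfrak{A} \subseteq \mathfrak{B}$. Let $I$ be an
interpretation of $\mathfrak{A}$. By Definition 7.1.4, we have the following:

• For all variable symbols $x$, $I(x) \in A \subseteq B$.
• For all constant symbols $c$, $I(c) \in A \subseteq B$.
• For all $n$-ary function symbols $f$ and $S$-terms $t_0, t_1, \dots, t_{n-1}$, because $I(t_k) \in A$
for $k = 0, 1, \dots, n - 1$ and $\mathfrak{a}(f) = \mathfrak{b}(f) \upharpoonright A^n$,
\[
\begin{aligned}
I(f(t_0, t_1, \dots, t_{n-1})) &= \mathfrak{a}(f)(I(t_0), I(t_1), \dots, I(t_{n-1})) \\
&= \mathfrak{b}(f)(I(t_0), I(t_1), \dots, I(t_{n-1})).
\end{aligned}
\]

This implies that $I$ is also an interpretation of $\mathfrak{B}$. Moreover, when $p$ is a quantifier-free
$S$-formula, the interpretation of $p$ in $\mathfrak{B}$ requires no element of $B \setminus A$. This implies the
next theorem. Its proof is left to Exercise 21.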




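THEOREM 7.3.27


Let $\mathfrak{A}$ and $\mathfrak{B}$ be $S$-structures such that $\mathfrak{A} \subseteq \mathfrak{B}$. For all $S$-interpretations $I$ of $\mathfrak{A}$
and all quantifier-free $S$-formulas $p$, $\mathfrak{A} \vDash p\ [I]$ if and only if $\mathfrak{B} \vDash p\ [I]$.


Next, consider $\mathfrak{B} \vDash \exists x p\ [I]$. This means that $\mathfrak{B} \vDash p\ [I_x^b]$ for some $b \in B$. If $b$ is
also in $A$, then $\mathfrak{A} \vDash p\ [I_x^b]$, but this conclusion is not guaranteed if $b \in B \setminus A$. For
example, let $\mathfrak{A} = (\mathbb{Z}, <)$ and $\mathfrak{B} = (\mathbb{Q}, <)$ be $\{R\}$-structures, where $R$ is a binary relation
symbol. Since $\mathbb{Z} \subseteq \mathbb{Q}$, we have that $\mathfrak{A}$ is a substructure of $\mathfrak{B}$. However, because $\mathbb{Q}$ is
densely ordered and $\mathbb{Z}$ is not,
\[
\mathfrak{B} \vDash \forall x_1 \forall x_2 \exists y (x_1 < x_2 \to x_1 < y < x_2)
\]
but
\[
\mathfrak{A} \nvDash \forall x_1 \forall x_2 \exists y (x_1 < x_2 \to x_1 < y < x_2).
\]
Hence, in order for $\mathfrak{A} \vDash p\ [I]$ if and only if $\mathfrak{B} \vDash p\ [I]$ for even quantified formulas $p$,
a stronger condition is required.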
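
The difference between $\mathbb{Z}$ and $\mathbb{Q}$ used above is that a rational can always be found strictly between two distinct rationals, while the integers 0 and 1 have no integer strictly between them. A small Python sketch makes this concrete. The helper names `between_in_Q` and `between_in_Z` are hypothetical, and the midpoint is only one of many possible witnesses.

```python
from fractions import Fraction

def between_in_Q(x1, x2):
    """A rational strictly between x1 < x2: here, the midpoint."""
    return (x1 + x2) / 2

def between_in_Z(x1, x2):
    """An integer strictly between x1 < x2, or None if there is none."""
    return next((y for y in range(x1 + 1, x2)), None)

# The witness exists in Q but not in Z for the pair 0, 1.
q_witness = between_in_Q(Fraction(0), Fraction(1))
z_witness = between_in_Z(0, 1)
```

So the existential witness for the pair $0, 1$ lies in $\mathbb{Q} \setminus \mathbb{Z}$, which is exactly the situation in which satisfaction in the larger structure does not pass down to the substructure.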


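DEFINITION 7.3.28


$\mathfrak{A}$ is an elementary substructure of $\mathfrak{B}$ (written as $\mathfrak{A} \preceq \mathfrak{B}$) if $\mathfrak{A} \subseteq \mathfrak{B}$ and for all
interpretations $I$ of $\mathfrak{A}$ and $S$-formulas $p$,
\[
\mathfrak{A} \vDash p\ [I] \text{ if and only if } \mathfrak{B} \vDash p\ [I].
\]
If $\mathfrak{A} \preceq \mathfrak{B}$, then $\mathfrak{B}$ is an elementary extension of $\mathfrak{A}$.


Suppose that $\mathfrak{A}$ and $\mathfrak{B}$ are $S$-structures such that $\mathfrak{A}$ is an elementary substructure of
$\mathfrak{B}$. Let $p$ be an $S$-sentence. Since $p$ is also an $S$-formula, $\mathfrak{A} \vDash p$ if and only if $\mathfrak{B} \vDash p$
by Definition 7.3.28. Therefore, $\mathfrak{A}$ and $\mathfrak{B}$ are elementary equivalent, proving the next
result.


THEOREM 7.3.29
Let $\mathfrak{A}$ and $\mathfrak{B}$ be $S$-structures. If $\mathfrak{A} \preceq \mathfrak{B}$, then $\mathfrak{A} \equiv \mathfrak{B}$.


However, the converse of Theorem 7.3.29 does not hold. It is possible for $\mathfrak{A} \subseteq \mathfrak{B}$
and $\mathfrak{A} \equiv \mathfrak{B}$ yet $\mathfrak{A}$ not be an elementary substructure of $\mathfrak{B}$. If this is to be the case,
there must be a formula $p$ with at least one free variable that is satisfied by one of the
structures but not the other.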


EXAMPLE 7.3.30


Let $\mathfrak{A} = ([0, 1], \mathfrak{a})$ and $\mathfrak{B} = ([0, 2], \mathfrak{b})$ be the $\{R\}$-structures of Example 7.2.2,
where $\mathfrak{a}(R)$ is standard $<$ on $[0, 1]$ and $\mathfrak{b}(R)$ is standard $<$ on $[0, 2]$. Recall that
$\mathfrak{A}$ is a substructure of $\mathfrak{B}$. Since $f$ defined by $f(x) = 2x$ is an $\{R\}$-isomorphism
$[0, 1] \to [0, 2]$, we conclude that $\mathfrak{A} \equiv \mathfrak{B}$ (Theorem 7.3.24). However, define the
formula
\[
p(x) := \exists y R(x, y)
\]
and let $I$ be an interpretation such that $I(x) = 1$. Then,
\[
\mathfrak{A} \nvDash p(x)\ [I]
\]
and
\[
\mathfrak{B} \vDash p(x)\ [I].
\]
Therefore, $\mathfrak{A}$ is not an elementary substructure of $\mathfrak{B}$.


Because of Theorem 7.3.27, proving that one structure is an elementary substructure 
of another reduces to a check of one particular condition. This condition is found in 
the next theorem, which is due to Tarski and Vaught (1957). 


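THEOREM 7.3.31 [Tarski-Vaught]


Let $\mathfrak{A}$ and $\mathfrak{B}$ be $S$-structures such that $\mathfrak{A} \subseteq \mathfrak{B}$. Let $A$ be the domain of $\mathfrak{A}$. The
following are equivalent.

• $\mathfrak{A}$ is an elementary substructure of $\mathfrak{B}$.
• For every $S$-formula $p$ and $S$-interpretation $I$ of $\mathfrak{A}$,
\[
\text{if } \mathfrak{B} \vDash \exists x p\ [I], \text{ then } \mathfrak{B} \vDash p\ [I_x^a] \text{ for some } a \in A. \tag{7.19}
\]


PROOF
Suppose that $\mathfrak{A} \preceq \mathfrak{B}$. Let $p$ be an $S$-formula and $I$ an $S$-interpretation of $\mathfrak{A}$. Let
$\mathfrak{B} \vDash \exists x p\ [I]$. By hypothesis, we have that $\mathfrak{A} \vDash \exists x p\ [I]$. Thus, $\mathfrak{A} \vDash p\ [I_x^a]$ for
some $a \in A$, so again by hypothesis, $\mathfrak{B} \vDash p\ [I_x^a]$ for some $a \in A$.
Conversely, assume (7.19), and let $p$ be an $S$-formula and $I$ an $S$-interpretation of $\mathfrak{A}$.
We proceed by induction on formulas to show that $\mathfrak{A} \preceq \mathfrak{B}$.

• If $p$ is $t_0 = t_1$ or $R(t_0, t_1, \dots, t_{n-1})$ for $S$-terms $t_0, t_1, \dots, t_{n-1}$, then $p$ has no
quantifiers, and the result follows by Theorem 7.3.27.

• Let $p$ be $\neg q$. Then,
\[
\begin{aligned}
\mathfrak{A} \vDash p\ [I] &\Leftrightarrow \mathfrak{A} \vDash \neg q\ [I] \\
&\Leftrightarrow \mathfrak{A} \nvDash q\ [I] \\
&\Leftrightarrow \mathfrak{B} \nvDash q\ [I] \\
&\Leftrightarrow \mathfrak{B} \vDash \neg q\ [I] \\
&\Leftrightarrow \mathfrak{B} \vDash p\ [I].
\end{aligned}
\]

• Suppose $p$ is $q \to r$. Assume that $\mathfrak{A} \vDash q \to r\ [I]$ and let $\mathfrak{B} \vDash q\ [I]$. Then,
$\mathfrak{A} \vDash q\ [I]$ by induction, so $\mathfrak{A} \vDash r\ [I]$. Again, by induction, $\mathfrak{B} \vDash r\ [I]$.
Therefore, $\mathfrak{B} \vDash q \to r\ [I]$. The converse is proved similarly.

• Now let $p$ be $\exists x q$. First, because $\mathfrak{A} \subseteq \mathfrak{B}$,
\[
\begin{aligned}
\mathfrak{A} \vDash \exists x q\ [I] &\Rightarrow \mathfrak{A} \vDash q\ [I_x^a] \text{ for some } a \in A \\
&\Rightarrow \mathfrak{B} \vDash q\ [I_x^a] \text{ for some } a \in B \\
&\Rightarrow \mathfrak{B} \vDash \exists x q\ [I].
\end{aligned}
\]
Second, by (7.19),
\[
\begin{aligned}
\mathfrak{B} \vDash \exists x q\ [I] &\Rightarrow \mathfrak{B} \vDash q\ [I_x^a] \text{ for some } a \in A \\
&\Rightarrow \mathfrak{A} \vDash q\ [I_x^a] \text{ for some } a \in A \\
&\Rightarrow \mathfrak{A} \vDash \exists x q\ [I]. \quad ∎
\end{aligned}
\]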


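EXAMPLE 7.3.32


Prove that $\mathfrak{A} = (\mathbb{Q}, <)$ is an elementary substructure of $\mathfrak{B} = (\mathbb{R}, <)$. To do this,
follow the test given by Theorem 7.3.31. Let $p$ be an $S$-formula and $I$ be an
$S$-interpretation of $\mathfrak{A}$. Assume that $\mathfrak{B} \vDash \exists x p\ [I]$. This implies that
\[
\mathfrak{B} \vDash p\ [I_x^b] \text{ for some } b \in \mathbb{R}.
\]
Take $a_0, a_1, a \in \mathbb{Q}$ such that $b, a \in (a_0, a_1)$ and define the functions
\[
\begin{aligned}
f_0 &: (-\infty, a_0] \to (-\infty, a_0], \\
f_1 &: (a_0, b) \to (a_0, a), \\
f_2 &: [b, a_1) \to [a, a_1), \\
f_3 &: [a_1, \infty) \to [a_1, \infty),
\end{aligned}
\]
so that $f_0$ is the identity on $(-\infty, a_0]$, $f_3$ is the identity on $[a_1, \infty)$, and $f_1$ and
$f_2$ are order isomorphisms (Exercise 22). Let $f = f_0 \cup f_1 \cup f_2 \cup f_3$. Then,
$f$ is an order isomorphism $\mathbb{R} \to \mathbb{R}$ such that $f(b) = a$, $f(a_0) = a_0$, and
$f(a_1) = a_1$. Hence, $f$ is an automorphism (Exercise 23). Therefore, by
Lemmas 7.3.21 and 7.3.22,
\[
\mathfrak{B} \vDash p\ [I_x^b] \Leftrightarrow \mathfrak{B} \vDash p\ [(I_x^b)_f] \Leftrightarrow \mathfrak{B} \vDash p\ [(I_f)_x^a],
\]
which implies that
\[
\mathfrak{B} \vDash p\ [I_x^b] \text{ for some } b \in \mathbb{Q}.
\]
Thus, $\mathfrak{A} \preceq \mathfrak{B}$, which also implies that $\mathfrak{A} \equiv \mathfrak{B}$ (Theorem 7.3.29).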
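
One way to picture the automorphism $f$ is as a piecewise-linear map. The Python sketch below is a possible construction under that assumption. The linear formulas for the two middle pieces are one choice among many (finding suitable $f_1$ and $f_2$ is the subject of Exercise 22), and the function name and the sample values are hypothetical. Floating-point numbers stand in for real numbers.

```python
import math

def make_automorphism(a0, a1, a, b):
    """Order isomorphism R -> R that fixes (-inf, a0] and [a1, inf)
    pointwise and sends b to a, built from two linear pieces (an assumption)."""
    assert a0 < a < a1 and a0 < b < a1

    def f(x):
        if x <= a0 or x >= a1:
            return x                                       # identity pieces f0 and f3
        if x < b:
            return a0 + (x - a0) * (a - a0) / (b - a0)     # (a0, b) onto (a0, a)
        return a + (x - b) * (a1 - a) / (a1 - b)           # [b, a1) onto [a, a1)

    return f

# Assumed sample data: move the irrational sqrt(2) to the rational 3/2.
f = make_automorphism(1.0, 2.0, 1.5, math.sqrt(2))
```

With these values `f(math.sqrt(2))` is `1.5`, and `f` leaves every point outside the interval from 1 to 2 fixed, matching the properties $f(b) = a$, $f(a_0) = a_0$, and $f(a_1) = a_1$ used in the example.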


Notice that this example implies that there is no first-order sentence that can distinguish
between $(\mathbb{Q}, <)$ and $(\mathbb{R}, <)$ even though they are not isomorphic. Also, this
example shows that the converse of Theorem 7.3.24 is false if the domains of the structures
are infinite.

Certainly, for all $S$-structures $\mathfrak{A}$, $\mathfrak{B}$, and $\mathfrak{C}$, we have the following by Exercise 19:

• $\mathfrak{A} \equiv \mathfrak{A}$.
• If $\mathfrak{A} \equiv \mathfrak{B}$, then $\mathfrak{B} \equiv \mathfrak{A}$.
• If $\mathfrak{A} \equiv \mathfrak{B}$ and $\mathfrak{B} \equiv \mathfrak{C}$, then $\mathfrak{A} \equiv \mathfrak{C}$.

Similar to this and Theorem 7.2.4, we have the following result for elementary substructures.




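THEOREM 7.3.33


Let $\mathfrak{A}$, $\mathfrak{B}$, and $\mathfrak{C}$ be $S$-structures.
• $\mathfrak{A} \preceq \mathfrak{A}$.
• If $\mathfrak{A} \preceq \mathfrak{B}$ and $\mathfrak{B} \preceq \mathfrak{C}$, then $\mathfrak{A} \preceq \mathfrak{C}$.
• If $\mathfrak{A} \preceq \mathfrak{C}$, $\mathfrak{B} \preceq \mathfrak{C}$, and $\mathfrak{A} \subseteq \mathfrak{B}$, then $\mathfrak{A} \preceq \mathfrak{B}$.


PROOF
We prove the second part and leave the others to Exercise 25. Let $\mathfrak{A} \preceq \mathfrak{B}$ and
$\mathfrak{B} \preceq \mathfrak{C}$. This first means that $\mathfrak{A} \subseteq \mathfrak{B}$ and $\mathfrak{B} \subseteq \mathfrak{C}$. Thus, by Theorem 7.2.4,
$\mathfrak{A} \subseteq \mathfrak{C}$. Let $I$ be an interpretation of $\mathfrak{A}$. The proof is completed for quantifier-free
formulas using Theorem 7.3.27 (Exercise 24), so let $p$ be an $S$-formula with
a quantifier.

• Suppose $\mathfrak{A} \vDash \exists x p\ [I]$. This implies that $\mathfrak{A} \vDash p\ [I_x^a]$ for some $a \in A$.
Then, $\mathfrak{B} \vDash p\ [I_x^a]$, from which it follows that $\mathfrak{C} \vDash p\ [I_x^a]$. Since $A \subseteq C$, we
conclude that $\mathfrak{C} \vDash \exists x p\ [I]$.

• Conversely, let $\mathfrak{C} \vDash \exists x p\ [I]$. By Theorem 7.3.31, there exists $b \in B$ such
that $\mathfrak{C} \vDash p\ [I_x^b]$. By hypothesis, $\mathfrak{B} \vDash p\ [I_x^b]$, so $\mathfrak{B} \vDash \exists x p\ [I]$. Again,
by Theorem 7.3.31, there exists $a \in A$ such that $\mathfrak{B} \vDash p\ [I_x^a]$. From this
it follows that $\mathfrak{A} \vDash p\ [I_x^a]$, whence $\mathfrak{A} \vDash \exists x p\ [I]$. ∎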


Exercises 


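1. Let $\mathfrak{A}$, $\mathfrak{B}$, and $\mathfrak{C}$ be $S$-structures such that $\mathfrak{A} \subseteq \mathfrak{B}$. Let $\varphi : \mathfrak{B} \to \mathfrak{C}$ be a homomorphism.
Prove that $\varphi[\mathrm{dom}(\mathfrak{A})]$ is the domain of a substructure of $\mathfrak{C}$.


2. Prove that the following are group homomorphisms. Assume in each instance that
$\circ$ is interpreted as the standard addition on the set.

(a) $\varphi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ where $\varphi(a, b) = a$

(b) $\varphi : \mathbb{Z}_{12} \to \mathbb{Z}_6$ where $\varphi([a]_{12}) = [a]_6$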

(c) w:ZxZ— Mb, »(R) where 


wa, b)= f i 


3. Prove that the zero map is a group homomorphism. 


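4. Let $\mathfrak{Z} = (\mathbb{Z}, +)$ and $\mathfrak{Z}_n = (\mathbb{Z}_n, +)$ with $n \in \mathbb{Z}^+$. Prove that $\varphi : \mathbb{Z} \to \mathbb{Z}_n$ defined by
$\varphi(x) = [x]_n$ is a ring homomorphism $\mathfrak{Z} \to \mathfrak{Z}_n$.
5. Let $\mathfrak{C} = (\mathbb{C}, 0 + 0i, +, \cdot)$. Define $\psi : \mathbb{C} \to \mathbb{C}$ by $\psi(a + bi) = a - bi$ for all $a, b \in \mathbb{R}$.
Prove the following.

(a) $\psi$ is a ring homomorphism $\mathfrak{C} \to \mathfrak{C}$.

(b) $\psi$ is one-to-one and onto.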




6. Prove the first part of Theorem 7.3.8. 


7. Let $\mathfrak{R} = (R, 0, +, \cdot)$ be a ring and $\mathfrak{I}$ be an ideal of $\mathfrak{R}$ with domain $I$. Let the
function $\varphi : R \to R/I$ be defined as $\varphi(a) = a + I$ for all $a \in R$.

(a) Prove that $\varphi$ is a homomorphism.

(b) Show that $\ker(\varphi) = I$.

(c) Prove that $\varphi$ is onto.


8. Prove Theorem 7.3.15. 
9. Prove that the function g of Example 7.3.16 is a bijection. 


10. Let 2f and 8 be S-structures. Write 20 < % if there exists a one-to-one homomor- 
phism g : 2% —> %. Prove the following. 

(a) WS A 

(b) If 2 < B and B x G, then W =< CG. 

(c) Itis possible for 2 < Band B X © yet 2 and B are different S-structures. 


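11. Let $\mathfrak{A}$ and $\mathfrak{A}'$ be $\{R\}$-structures with equal domains that are finite, where $R$ is a
binary relation symbol. Assume that

$\mathfrak{A} \vDash X$ has a least element with respect to $R$

and

$\mathfrak{A}' \vDash X$ has a least element with respect to $R$

for every $X \subseteq \mathrm{dom}(\mathfrak{A})$. Prove that $\mathfrak{A} \cong \mathfrak{A}'$.


12. Assume that $S$ is finite and that the domain of the $S$-structure $\mathfrak{A}$ is finite. Find an
$S$-sentence $p$ such that $\mathfrak{B} \vDash p$ if and only if $\mathfrak{B} \cong \mathfrak{A}$, for all $S$-structures $\mathfrak{B}$.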


13. Suppose that $\varphi : R \to R'$ is a ring isomorphism as in Example 7.3.18. Prove.
(a) If $R$ is an integral domain, $R'$ is an integral domain.
(b) If $R$ is a division ring, $R'$ is a division ring.


14. Let $\varphi : R \to R'$ be a ring isomorphism. Prove that 1 is unity from $R$ if and only
if $\varphi(1)$ is unity from $R'$.


15. Prove or show false the given elementary equivalences. 
(a) $(\mathbb{R}, 0, +) \equiv (\mathbb{C}, 0, +)$.
(b) $(\mathbb{Z}, 0, +) \equiv (\mathbb{Q}, 0, +)$.
(c) (24,5) =(Q,9). 
(d) (@, <) = (Z*,S). 
) (9) =(Z,9). 


16. Let A and B be sets and assume that S = @. Prove that (A) = (B). 


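17. Generalize the argument of Example 7.3.26 to prove that $\mathfrak{A} \equiv \mathfrak{B}$ implies that
$\mathfrak{A} \cong \mathfrak{B}$ if the cardinality of the domain of $\mathfrak{A}$ equals 2.


18. Generalize the argument of Example 7.3.26 to prove that $\mathfrak{A} \equiv \mathfrak{B}$ implies that
$\mathfrak{A} \cong \mathfrak{B}$ if the domain of $\mathfrak{A}$ is finite.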




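19. Let $\mathfrak{A}$, $\mathfrak{B}$, and $\mathfrak{C}$ be $S$-structures. Prove the following.
(a) $\mathfrak{A} \equiv \mathfrak{A}$.
(b) If $\mathfrak{A} \equiv \mathfrak{B}$, then $\mathfrak{B} \equiv \mathfrak{A}$.
(c) If $\mathfrak{A} \equiv \mathfrak{B}$ and $\mathfrak{B} \equiv \mathfrak{C}$, then $\mathfrak{A} \equiv \mathfrak{C}$.


20. Let $\mathfrak{A}$ and $\mathfrak{B}$ be elementary equivalent $S$-structures. Prove that there exists an
$S$-structure $\mathfrak{C}$ such that $\mathfrak{A} \preceq \mathfrak{C}$ and $\mathfrak{B} \preceq \mathfrak{C}$.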


21. Prove Theorem 7.3.27. 
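22. Find the order isomorphisms $f_1$ and $f_2$ in Example 7.3.32.
23. Prove that the function $f$ from Example 7.3.32 is a $\{<\}$-isomorphism.


24. Prove that for all $S$-structures $\mathfrak{A}$, $\mathfrak{B}$, and $\mathfrak{C}$, if $\mathfrak{A} \preceq \mathfrak{B}$ and $\mathfrak{B} \preceq \mathfrak{C}$, then $\mathfrak{A} \vDash p$ if
and only if $\mathfrak{C} \vDash p$ for all quantifier-free $p$.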


25. Prove the first and third parts of Theorem 7.3.33. 


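26. Let $\mathcal{F} = \{\mathfrak{A}_\gamma : \gamma \in \kappa\}$ be a chain of $S$-structures for some cardinal $\kappa$ (Exercise 7.2.3).
Assume that $\mathfrak{A}_\alpha \preceq \mathfrak{A}_\beta$ when $\alpha \in \beta \in \kappa$, making $\mathcal{F}$ an elementary chain.
Prove that $\mathfrak{A}_\alpha \preceq \bigcup_{\gamma \in \kappa} \mathfrak{A}_\gamma$ for all $\alpha \in \kappa$.


27. Find a set of subject symbols $S$ and a chain of $S$-structures $\{\mathfrak{A}_n : n \in \omega\}$ such that
$\mathfrak{A}_i \equiv \mathfrak{A}_j$ for all $i, j \in \omega$ but $\mathfrak{A}_0 \not\equiv \bigcup_{n \in \omega} \mathfrak{A}_n$.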


7.4 THE THREE PROPERTIES REVISITED 


This section is an extension of Section 1.5 to first-order logic. First, we define what it 
means to be consistent with the ultimate goal to show that any consistent system has a 
model. Next, we show that first-order logic is sound in that every sentence that can be 
proved is true, provided we have the correct understanding of what it means to prove 
and what it means to be true. Lastly, we show that first-order logic is complete in that 
every true sentence can be proved. 


Consistency 


Consider the following propositions representing Euclid's axioms for his geometry (Euclid
1925, I Postulates).

• Eu1. A line can be drawn through two distinct points.

• Eu2. A line segment can be drawn between two distinct points.

• Eu3. A circle can be drawn given any center and radius.

• Eu4. All right angles are congruent to one another.

• Eu5. If a line falling on two straight lines makes the interior angles on the same
side less than two right angles, the two lines intersect on the side on which the
angles are less than two right angles.




There are three sets of propositions here. First is the list itself,
\[
\mathcal{E} = \{\text{Eu1}, \text{Eu2}, \text{Eu3}, \text{Eu4}, \text{Eu5}\}.
\]
Second is the set of consequences (Definition 7.1.20) of $\mathcal{E}$,
\[
\mathcal{E}_1 = \{p : \mathcal{E} \vDash p\}.
\]
Third is the set of propositions provable (Definition 1.2.13) from $\mathcal{E}$,
\[
\mathcal{E}_2 = \{p : \mathcal{E} \vdash p\}.
\]

For Euclidean geometry, $\mathcal{E}_1 = \mathcal{E}_2$. This means that both $\mathcal{E}_1$ and $\mathcal{E}_2$ can be said to
describe all of the propositions that are in the geometry. It is the job of the geometer to
discover what those propositions are.

We want to similarly analyze first-order logic. Given the rules of logic and a set 
of theory symbols, we want the set of consequences to be equal to the set of provable 
sentences. One way to have this happen is for a contradiction to follow from the axioms 
(1.2.8). Then, by Theorem 1.5.2, every propositional form will have a proof and, in 
turn, will also be a consequence, but we do not want such a system. Therefore, for all of 
this to work effectively, it is a requirement that first-order logic is without contradiction. 
To deal with this concept, we start with a definition. 


DEFINITION 7.4.1
An $S$-theory is a set of $S$-sentences.


The set of all sentences provable in first-order logic given a set of theory symbols $S$ is
an example of an $S$-theory. The theories that we study should have a familiar property
(compare Definition 1.5.1).


DEFINITION 7.4.2


An $S$-theory $\mathcal{T}$ is consistent [denoted by $\mathrm{Con}(\mathcal{T})$] if $\mathcal{T} \nvdash q \land \neg q$ for every
$S$-sentence $q$. Otherwise, $\mathcal{T}$ is inconsistent.


We next generalize Theorem 1.5.2 to first-order logic. Its proof is left to Exercise 2. 


THEOREM 7.4.3
Let $\mathcal{T}$ be an $S$-theory. The following are equivalent.
• $\mathcal{T}$ is consistent.
• Every finite subset of $\mathcal{T}$ is consistent.
• There is an $S$-sentence that is not provable from $\mathcal{T}$.


As a consequence of Theorem 7.4.3, a theory $\mathcal{T}$ is inconsistent if either it has a finite
subset that is inconsistent or it is able to prove all sentences. This identifies the major
weakness of an inconsistent theory. Not only can a contradiction be proved, but any
sentence can also be proved. Such a system is certainly worthless.
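
The collapse of an inconsistent theory can be seen in miniature with truth tables. The Python sketch below works in propositional logic and uses semantic entailment as a stand-in for provability, which is an assumption justified for propositional logic by the soundness and completeness results of Section 1.5. Formulas are modeled as Python functions on valuations, and the names used are hypothetical.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """True if every valuation satisfying all premises satisfies the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

ATOMS = ["q", "r"]
q = lambda v: v["q"]
not_q = lambda v: not v["q"]
r = lambda v: v["r"]
not_r = lambda v: not v["r"]

# An inconsistent premise set entails an unrelated atom and its negation.
explodes = entails([q, not_q], r, ATOMS) and entails([q, not_q], not_r, ATOMS)

# A consistent premise set does not entail everything.
consistent_is_selective = not entails([q], r, ATOMS)
```

No valuation satisfies both `q` and `not_q`, so the entailment holds vacuously for every conclusion, which mirrors the third condition of Theorem 7.4.3.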

If a theory is consistent, it can be shown to be a subset of a theory that is the greatest 
possible consistent theory. We generalize Definition 1.5.3 to name this theory. 


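DEFINITION 7.4.4


The $S$-theory $\mathcal{T}$ is maximally consistent if $\mathcal{T}$ is consistent and $\mathrm{Con}(\mathcal{T} \cup \{p\})$
implies $p \in \mathcal{T}$ for all $S$-sentences $p$.


The next lemma gives some basic properties of maximally consistent theories.

LEMMA 7.4.5
Let $\mathcal{T}$ be a maximally consistent $S$-theory. Let $p$ and $q$ be $S$-sentences.
• If $\mathcal{T} \vdash p$, then $p \in \mathcal{T}$.
• If $\vdash p$, then $p \in \mathcal{T}$.
• $p \in \mathcal{T}$ or $\neg p \in \mathcal{T}$, but not both.
• If $p \to q, p \in \mathcal{T}$, then $q \in \mathcal{T}$.


PROOF
• Let $\mathcal{T} \vdash p$. Suppose that $\mathcal{T} \cup \{p\} \vdash q \land \neg q$ for some $S$-sentence $q$. This
means that there exist $S$-sentences $p_0, p_1, \dots, p_{n-1}$ such that
\[
p, p_0, p_1, \dots, p_{n-1}, q \land \neg q
\]
is a proof of $q \land \neg q$ from $\mathcal{T} \cup \{p\}$. Since $\mathcal{T}$ proves $p$, there are $S$-sentences
$q_0, q_1, \dots, q_{m-1}$ such that
\[
q_0, q_1, \dots, q_{m-1}, p
\]
is a proof of $p$ from $\mathcal{T}$. Therefore,
\[
q_0, q_1, \dots, q_{m-1}, p, p_0, \dots, p_{n-1}, q \land \neg q
\]
is a proof of $q \land \neg q$ from $\mathcal{T}$, a contradiction. This implies that $\mathcal{T} \cup \{p\}$ is
consistent, so since $\mathcal{T}$ is maximally consistent, $p \in \mathcal{T}$.

• If $\vdash p$, then $\mathcal{T} \vdash p$, so $p \in \mathcal{T}$ by the first property.

• Since $\mathcal{T}$ is consistent, $\mathrm{Con}(\mathcal{T} \cup \{p\})$ or $\mathrm{Con}(\mathcal{T} \cup \{\neg p\})$. Because $\mathcal{T}$ is
maximally consistent, $p \in \mathcal{T}$ or $\neg p \in \mathcal{T}$, but not both because $\mathrm{Con}(\mathcal{T})$.

• Let $p \to q$ and $p$ be members of $\mathcal{T}$. By MP, we have that $\mathcal{T} \vdash q$, so again
by the first part, $q \in \mathcal{T}$. ∎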


Given a consistent theory, the construction of the maximally consistent theory that 
contains it requires the use of a chain. 




LEMMA 7.4.6


If $(\mathcal{C}, \subseteq)$ is a chain of consistent $S$-theories, $\bigcup \mathcal{C}$ is consistent.


PROOF
Let $\mathcal{C}$ be a chain of consistent sets of $S$-sentences with respect to $\subseteq$. Suppose
$\bigcup \mathcal{C} \vdash q \land \neg q$ for some sentence $q$. Hence, there exist $p_0, p_1, \dots, p_{n-1} \in \bigcup \mathcal{C}$
such that
\[
p_0, p_1, \dots, p_{n-1} \vdash q \land \neg q.
\]
This implies that for each $i = 0, 1, \dots, n - 1$, there exists $C_i \in \mathcal{C}$ such that
$p_i \in C_i$. Since $\{C_i : i = 0, 1, \dots, n - 1\}$ is a finite chain, it has a greatest element
$C_k$ with respect to $\subseteq$. This gives
\[
\{p_0, p_1, \dots, p_{n-1}\} \subseteq C_k,
\]
which implies that $C_k \vdash q \land \neg q$, contradicting the consistency of $C_k$. ∎


The next result proves the existence of a greatest consistent theory by generalizing 
Theorem 1.5.4 to an arbitrary set of sentences. 


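THEOREM 7.4.7 [Lindenbaum]


Every consistent $S$-theory is a subset of a maximally consistent $S$-theory.


PROOF
Let $\mathcal{T}$ be a consistent set of $S$-sentences. Define
\[
\mathcal{A} = \{\mathcal{E} : \mathcal{T} \subseteq \mathcal{E} \text{ and } \mathcal{E} \text{ is a consistent set of } S\text{-sentences}\}.
\]
Note that $\mathcal{A} \neq \emptyset$ since $\mathcal{T} \in \mathcal{A}$. Let $\mathcal{C}$ be a chain in $\mathcal{A}$. Then, $\bigcup \mathcal{C} \in \mathcal{A}$ because

• $\mathcal{T} \subseteq \bigcup \mathcal{C}$.

• $\bigcup \mathcal{C}$ is consistent by Lemma 7.4.6 because each element of the chain $\mathcal{C}$ is
consistent.

We conclude by Zorn's lemma (5.1.13) that there is a maximal element $M \in \mathcal{A}$
with respect to $\subseteq$. By definition, $\mathcal{T} \subseteq M$ and $\mathrm{Con}(M)$. Also, let $p$ be a sentence
such that $\mathrm{Con}(M \cup \{p\})$. Then, $M \cup \{p\} \in \mathcal{A}$, but since $M$ is maximal, we
conclude that $M \cup \{p\} = M$. Thus, $p \in M$, showing that $M$ is maximally
consistent. ∎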
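
For a finite list of propositional sentences, a maximally consistent extension can be built greedily, without Zorn's lemma. The Python sketch below is such a finite analogue and not the construction of the proof. It uses satisfiability as a stand-in for consistency (an assumption that is legitimate for propositional logic by the results of Section 1.5), and the atoms, the starting theory, and all names are hypothetical.

```python
from itertools import product

ATOMS = ("p", "q")

def satisfiable(formulas):
    """True if some valuation of ATOMS makes every formula true."""
    return any(
        all(f(dict(zip(ATOMS, values))) for f in formulas)
        for values in product([False, True], repeat=len(ATOMS))
    )

def extend(theory, candidates):
    """Add each candidate in turn whenever the result stays satisfiable."""
    result = list(theory)
    for name, formula in candidates:
        if satisfiable([g for _, g in result] + [formula]):
            result.append((name, formula))
    return result

# Assumed starting theory and candidate sentences, each given as (name, function).
theory = [("p -> q", lambda v: (not v["p"]) or v["q"])]
candidates = [
    ("p", lambda v: v["p"]),
    ("not p", lambda v: not v["p"]),
    ("q", lambda v: v["q"]),
    ("not q", lambda v: not v["q"]),
]

maximal = [name for name, _ in extend(theory, candidates)]
```

The result contains exactly one of each pair of a candidate and its negation, in line with the third property of Lemma 7.4.5. A different ordering of the candidates can produce a different maximal extension.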


Soundness 


We previously defined a logic to be sound if every theorem is a tautology and complete 
if every tautology is a theorem (Definition 1.5.5). We now give the corresponding 
definition for first-order logic (Figure 7.3). 




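[Figure 7.3. Sound and complete logics. The diagram relates "The sentence is a theorem" and "The sentence is valid": soundness leads from the first to the second, and completeness leads from the second to the first.]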


DEFINITION 7.4.8


• A logic is sound if every theorem is valid.

• A logic is complete if every valid sentence is a theorem.


A propositional form p is a tautology if v(p) = T for every valuation v (Definition 1.1.14). 
This means that any interpretation of p will always yield a true proposition. This, in 
turn, implies that p will hold in every model. That is, 


tautologies are valid. 


Therefore, it is evident that Definition 7.4.8 is a generalization of Definition 1.5.5 to 
first-order logic. 
As in Section 1.5, we begin with soundness. 


THEOREM 7.4.9 [Soundness]


Let $\mathcal{T}$ be a set of $S$-sentences. For any $S$-sentence $p$, if $\mathcal{T} \vdash p$, then $\mathcal{T} \vDash p$.


PROOF
Using strong induction, we prove that for all $k \in \mathbb{Z}^+$,
\[
\text{if there exists a proof of } p \text{ from } \mathcal{T} \text{ consisting of } k \text{ sentences, then } \mathcal{T} \vDash p. \tag{7.20}
\]
In the case that $n = 1$, the proof consists of $p$ alone. This implies that $p$ is an axiom or $p \in \mathcal{T}$
(Definition 1.2.13), so we have that $\mathcal{T} \vDash p$.

Now suppose that $n > 1$ and assume that (7.20) holds for all $k < n$. This
means that there exist sentences $p_0, p_1, \dots, p_{n-1}$ provable from $\mathcal{T}$ such that
\[
p_0, p_1, \dots, p_{n-1}, p
\]
is a proof of $p$. Let $\mathfrak{A}$ be any model of $\mathcal{T}$. We have three cases to consider
(Theorem 1.4.2).

• $p \in \mathcal{T}$, which implies that $\mathfrak{A} \vDash p$.

• $p$ is derived using MP. This implies there exists an $S$-sentence $r$ such
that $r, r \to p \in \{p_0, p_1, \dots, p_{n-1}\}$. Since a sentence of a proof is proved
from previous sentences of the proof, by induction, $\mathfrak{A} \vDash r \to p$ and $\mathfrak{A} \vDash r$.
Therefore, $\mathfrak{A} \vDash p$.

• $p$ is logically equivalent to $p_i$ for some $i = 0, 1, \dots, n - 1$. This means that
$p \leftrightarrow p_i$ is a tautology, so $\mathfrak{A} \vDash p \leftrightarrow p_i$. By Theorem 7.1.7, we conclude that
$\mathfrak{A} \vDash p$ if and only if $\mathfrak{A} \vDash p_i$, but by induction, $\mathfrak{A} \vDash p_i$, so $\mathfrak{A} \vDash p$. ∎


If $\mathcal{T} = \emptyset$, we can apply Theorem 7.4.9 to conclude the corollary.

COROLLARY 7.4.10

First-order logic is sound.

As with Corollary 1.5.11 for propositional logic, we can also prove the next result.


COROLLARY 7.4.11


First-order logic is consistent.


Completeness 


It is now time to prove the completeness of first-order predicate logic. This result is
due to Kurt Gödel (1929), but the proof given here is due to Leon Henkin (1949). As
opposed to the soundness theorem (7.4.9), the proof is rather involved. We begin with
a definition (compare with EI, Theorem 2.3.14).


DEFINITION 7.4.12


Let $\mathcal{T}$ be a set of $S$-sentences and $C$ a set of constant symbols of $S$. Define $C$ to
be a witness set for $\mathcal{T}$ if for all $S$-formulas $p = p(x)$, there exists $c \in C$ such that
\[
\mathcal{T} \vdash \exists x p \to p\,\tfrac{c}{x}.
\]


For example, let $\mathcal{T} = \{x + 1 = 0,\ x + (1 + 1) = 0,\ (x + y) + 1 = 1\}$ be a set of
NT-sentences. Extend NT to $S = \mathrm{NT} \cup C$, where $C = \{-1, -2\}$. Then, $C$ is a witness
set for $\mathcal{T}$ if
\[
\mathcal{T} \vdash \exists x (x + 1 = 0) \to (-1) + 1 = 0
\]
and
\[
\mathcal{T} \vdash \exists x (x + (1 + 1) = 0) \to (-2) + (1 + 1) = 0.
\]
Since $(x + y) + 1 = 1$ has two free variables, the witness set does not apply to it.
Recall that for a set of theory symbols $S$, the notation $L(S)$ refers to the set of all
$S$-formulas (Definition 2.1.13).




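LEMMA 7.4.13


Let $\mathcal{T}$ be a consistent set of $S$-sentences. Let $C$ be a set of constant symbols
not in $S$ such that $|C| = |L(S)|$. Then, $\mathcal{T}$ can be extended to a consistent set of
$(S \cup C)$-sentences that has $C$ as a witness set.


PROOF
Let $\kappa = |L(S)|$ and suppose that $C = \{c_\alpha : \alpha \in \kappa\}$ is a set of new constant
symbols. Assume that $c_\alpha \neq c_\beta$ for all $\alpha \in \beta \in \kappa$. Since $|L(S \cup C)| = \kappa$, we
identify the formulas of $L(S \cup C)$ with at most one free variable as
\[
p_\alpha = p_\alpha(x_\alpha)
\]
with $\alpha \in \kappa$. Now define a chain $\mathcal{C} = \{\mathcal{T}_\gamma : \gamma \in \kappa\}$ and a set $\{d_\gamma : \gamma \in \kappa\}$ by
transfinite recursion (Corollary 6.1.24) such that for all $\beta \in \kappa$,

• $\mathcal{T}_\beta$ is consistent,
• $\mathcal{T}_\beta$ is an $(S \cup C)$-theory,
• $|\mathcal{T}_{\beta+1} \setminus \mathcal{T}_\beta| = 1$,
• $d_\beta$ is not among the symbols of the sentences in $\mathcal{T}_\beta$.

Let $\mathcal{T}_0 = \mathcal{T}$ and suppose that $\mathcal{T}_\delta$ and $d_\delta$ have been defined for all $\delta \in \alpha$ with the
indicated properties.

• Let $\alpha = \beta + 1$. Choose $d_\beta$ from $C$ such that $d_\beta$ is not among the symbols
of the formulas in $\mathcal{T}_\beta \cup \{p_\beta\}$. This can be done since there are fewer than $\kappa$
theory symbols in $\mathcal{T}_\beta \cup \{p_\beta\}$. Now define
\[
\mathcal{T}_{\beta+1} = \mathcal{T}_\beta \cup \left\{ \exists x_\beta p_\beta \to p_\beta\,\tfrac{d_\beta}{x_\beta} \right\}.
\]
In order to obtain a contradiction, suppose that $\mathcal{T}_{\beta+1}$ is inconsistent. Since
$\mathcal{T}_\beta$ is consistent, it must be the case that
\[
\mathcal{T}_\beta \vdash \neg\left( \exists x_\beta p_\beta \to p_\beta\,\tfrac{d_\beta}{x_\beta} \right).
\]
This implies that
\[
\mathcal{T}_\beta \vdash \exists x_\beta p_\beta \land \neg p_\beta\,\tfrac{d_\beta}{x_\beta}.
\]
By Com and Simp,
\[
\mathcal{T}_\beta \vdash \neg p_\beta\,\tfrac{d_\beta}{x_\beta}.
\]
Since $d_\beta$ is not among the symbols of $p_\beta$ or $\mathcal{T}_\beta$, it is arbitrary, so by UG,
\[
\mathcal{T}_\beta \vdash \forall x_\beta \neg p_\beta.
\]
That is,
\[
\mathcal{T}_\beta \vdash \neg \exists x_\beta p_\beta.
\]
Using Simp and Conj, we conclude that
\[
\mathcal{T}_\beta \vdash \exists x_\beta p_\beta \land \neg \exists x_\beta p_\beta,
\]
contradicting the consistency of $\mathcal{T}_\beta$.

• If $\alpha$ is a limit ordinal, define
\[
\mathcal{T}_\alpha = \bigcup_{\beta \in \alpha} \mathcal{T}_\beta,
\]
which is consistent by Lemma 7.4.6.

We claim that the $(S \cup C)$-theory
\[
\bigcup_{\beta \in \kappa} \mathcal{T}_\beta
\]
is the desired extension of $\mathcal{T}$. To see this, first note that this union is consistent by
Lemma 7.4.6. Also, let $p(x)$ be an $(S \cup C)$-formula with at most one free variable.
This implies that $p = p_\alpha$ and $x = x_\alpha$ for some $\alpha \in \kappa$. Therefore,
\[
\exists x_\alpha p_\alpha \to p_\alpha\,\tfrac{d_\alpha}{x_\alpha} \in \mathcal{T}_{\alpha+1},
\]
so
\[
\bigcup_{\gamma \in \kappa} \mathcal{T}_\gamma \vdash \exists x_\alpha p_\alpha \to p_\alpha\,\tfrac{d_\alpha}{x_\alpha}. \quad ∎
\]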


Let $S$ be a set of theory symbols and $C$ be a set of constants from $S$. Let $\mathcal{T}$ be a
consistent $S$-theory. We want to define an $S$-structure that has a chance of being a model
of $\mathcal{T}$. One of the issues that must be overcome is whether any pair of constants should
be interpreted as being equal. If so, we want to view those constants as equivalent. To
do this, define a relation $\sim$ on $C$ by
\[
c_0 \sim c_1 \text{ if and only if } c_0 = c_1 \in \mathcal{T} \tag{7.22}
\]
for all $c_0, c_1 \in C$. This defines an equivalence relation under the right conditions.


LEMMA 7.4.14


Let $C$ be a set of constants from $S$. If $\mathcal{T}$ is a maximally consistent $S$-theory, then
$\sim$ is an equivalence relation on $C$.


PROOF
Let $c_0, c_1, c_2 \in C$.

• Since $\mathcal{T} \cup \{c_0 = c_0\}$ is consistent by E1 (Axioms 5.1.1), by the maximal
consistency of $\mathcal{T}$, we have that $c_0 = c_0 \in \mathcal{T}$.

• Suppose $c_0 = c_1 \in \mathcal{T}$. Then, $\mathcal{T} \vdash c_0 = c_1$, so $\mathcal{T} \vdash c_1 = c_0$ by E2. Thus,
$c_1 = c_0 \in \mathcal{T}$.

• Let $c_0 = c_1 \in \mathcal{T}$ and $c_1 = c_2 \in \mathcal{T}$. Hence, we have $\mathcal{T} \vdash c_0 = c_1$ and
$\mathcal{T} \vdash c_1 = c_2$. Then, $\mathcal{T} \vdash c_0 = c_2$ by E3, which yields $c_0 = c_2 \in \mathcal{T}$. ∎


We define the domain of the desired structure to be $C/{\sim}$, the set of equivalence classes
modulo $\sim$ (Definition 4.2.10). Before we define the function $\mathfrak{a}$, we need some actual
functions and relations on $C/{\sim}$. Use the following lemma to define them.


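LEMMA 7.4.15


Let $\mathcal{T}$ be a maximally consistent $S$-theory and $C$ be a set of constant symbols
from $S$. Let $t_0, t_1, \dots, t_{n-1}, t_0', t_1', \dots, t_{n-1}'$ be $S$-terms such that
\[
t_0 \sim t_0',\ t_1 \sim t_1',\ \dots,\ t_{n-1} \sim t_{n-1}'.
\]

• For every $n$-ary function symbol $f$,
\[
f(t_0, t_1, \dots, t_{n-1}) \sim f(t_0', t_1', \dots, t_{n-1}').
\]

• For every $n$-ary relation symbol $R$,
\[
R(t_0, t_1, \dots, t_{n-1}) \in \mathcal{T} \Leftrightarrow R(t_0', t_1', \dots, t_{n-1}') \in \mathcal{T}.
\]


PROOF
By (7.22) we have that
\[
t_0 = t_0',\ t_1 = t_1',\ \dots,\ t_{n-1} = t_{n-1}' \in \mathcal{T}.
\]
Let $f$ be an $n$-ary function symbol. Then,
\[
\mathcal{T} \vdash f(t_0, t_1, \dots, t_{n-1}) = f(t_0', t_1', \dots, t_{n-1}')
\]
by E4 (Axioms 5.1.1), so
\[
f(t_0, t_1, \dots, t_{n-1}) = f(t_0', t_1', \dots, t_{n-1}') \in \mathcal{T}
\]
by Lemma 7.4.5, which implies that $f(t_0, t_1, \dots, t_{n-1}) \sim f(t_0', t_1', \dots, t_{n-1}')$. Next,
let $R$ be an $n$-ary relation symbol such that $R(t_0, t_1, \dots, t_{n-1}) \in \mathcal{T}$. This implies
that $\mathcal{T} \vdash R(t_0, t_1, \dots, t_{n-1})$, so we have that $\mathcal{T} \vdash R(t_0', t_1', \dots, t_{n-1}')$ by E5.
Again, by Lemma 7.4.5, $R(t_0', t_1', \dots, t_{n-1}') \in \mathcal{T}$. ∎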




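We now define the functions and relations on the domain $C/{\sim}$ with $\mathcal{T}$ being maximally consistent.

Let $f$ be an $n$-ary function symbol from $S$. Let $c_0, c_1, \dots, c_{n-1} \in C$. Define the $n$-ary function $f_\sim : (C/{\sim})^n \to C/{\sim}$ by
$$f_\sim([c_0], [c_1], \dots, [c_{n-1}]) = [c] \Leftrightarrow f(c_0, c_1, \dots, c_{n-1}) = c \in \mathcal{T}. \qquad (7.23)$$

Since $f_\sim$ is defined on a set of equivalence classes, we must confirm that $f_\sim$ is well-defined. Let $d_0, d_1, \dots, d_{n-1} \in C$ such that
$$c_0 \sim d_0, \quad c_1 \sim d_1, \quad \dots, \quad c_{n-1} \sim d_{n-1}.$$
Then, $[c] = [d]$ for any $d \in C$ with $f(d_0, d_1, \dots, d_{n-1}) = d \in \mathcal{T}$ because by Lemma 7.4.15,
$$f(c_0, c_1, \dots, c_{n-1}) \sim f(d_0, d_1, \dots, d_{n-1}).$$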


Let $R$ be an $n$-ary relation symbol from $S$ and take $c_0, c_1, \dots, c_{n-1} \in C$. Define the $n$-ary relation $R_\sim$ on $C/{\sim}$ by
$$([c_0], [c_1], \dots, [c_{n-1}]) \in R_\sim \Leftrightarrow R(c_0, c_1, \dots, c_{n-1}) \in \mathcal{T}. \qquad (7.24)$$
We must check that the relation holds if different constant symbols are used to represent the classes $[c_0], [c_1], \dots, [c_{n-1}]$. To do this, let $d_0, d_1, \dots, d_{n-1} \in C$ such that $c_0 \sim d_0, c_1 \sim d_1, \dots, c_{n-1} \sim d_{n-1}$. Then, we have that
$$([d_0], [d_1], \dots, [d_{n-1}]) \in R_\sim$$
since by Lemma 7.4.15,
$$R(d_0, d_1, \dots, d_{n-1}) \in \mathcal{T}.$$


The previous work makes the interpretation of the function and relation symbols of $S$ obvious. However, what about the constant symbols? To find out, take a constant symbol $c$ of $S$. If $c \in C$, then $c$ should be interpreted as $[c]$, so suppose that $c \notin C$. Because $\mathcal{T} \vdash \exists x(x = c)$ and $\mathcal{T}$ is maximally consistent,
$$\exists x(x = c) \in \mathcal{T}.$$

At this point make the further assumption that $C$ is a witness set of $\mathcal{T}$. Then, there exists $d \in C$ such that
$$c = d \in \mathcal{T}.$$

This implies that $[c] = [d]$. Therefore, for every constant symbol $c \in S$, there exists $c_\sim \in C$ such that $[c] = [c_\sim]$. We can now define the structure.




HM DEFINITION 7.4.16 


Let $\mathcal{T}$ be a maximally consistent set of $S$-sentences with witness set $C$. Let $\sim$ be the relation (7.22). Define $\mathfrak{A}_\sim$ to be the $S$-structure with domain $C/{\sim}$ and function $\mathfrak{a}$ such that

• $\mathfrak{a}(c) = [c_\sim]$ for all constant symbols $c \in S$,
• $\mathfrak{a}(R) = R_\sim$ for all $n$-ary relation symbols $R \in S$,
• $\mathfrak{a}(f) = f_\sim$ for all $n$-ary function symbols $f \in S$.
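The passage from constants to equivalence classes can be illustrated on a toy scale. The following is only a sketch under strong simplifying assumptions: the set of constants is finite, the list `equalities` stands in for the equations $c = d$ that belong to the theory, and the names `quotient` and `interpret_relation` are hypothetical helpers introduced here, not notation from the text. A union-find pass produces the least equivalence relation containing the listed equations, which mirrors the closure under E1 to E3 used above.

```python
def quotient(constants, equations):
    """Partition `constants` into the classes of the least equivalence
    relation that contains every pair listed in `equations`."""
    parent = {c: c for c in constants}

    def find(c):
        # Follow parent links to the representative, halving paths on the way.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in equations:
        parent[find(a)] = find(b)

    classes = {}
    for c in constants:
        classes.setdefault(find(c), set()).add(c)
    return {frozenset(cls) for cls in classes.values()}


def interpret_relation(tuples_in_theory, classes):
    """Lift tuples of constants to tuples of classes, in the spirit of (7.24)."""
    class_of = {c: cls for cls in classes for c in cls}
    return {tuple(class_of[c] for c in tup) for tup in tuples_in_theory}


# Hypothetical data: four constants, with c0 = c1 and c1 = c2 in the theory.
C = ["c0", "c1", "c2", "c3"]
equalities = [("c0", "c1"), ("c1", "c2")]
domain = quotient(C, equalities)
R_tilde = interpret_relation([("c0", "c3")], domain)
```

Here `domain` has two elements, the class of `c0`, `c1`, `c2` and the class of `c3`, and `R_tilde` relates the first class to the second regardless of which representative was used to state the relation.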


Since we are investigating consistent sets of sentences, we do not need to involve a 
particular interpretation for the structures in the proofs of the following results (Theo- 
rem 7.1.31). Using Definition 7.4.16 will suffice. 


@ LEMMA 7.4.17 


Suppose that $\mathcal{T}$ is a maximally consistent set of $S$-sentences with $C$ as a witness set. Then, $\mathfrak{A}_\sim \models t = c$ if and only if $t = c \in \mathcal{T}$, for every constant symbol $c \in S$ and $S$-term $t$.


PROOF 
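Assume that $c \in S$ is a constant symbol and let $t$ be an $S$-term. Since $\mathcal{T}$ is a set of sentences, $t$ cannot have any free variables.

• Let $t = d$ for some constant symbol $d$ from $S$. Then,
$$\mathfrak{A}_\sim \models d = c \Leftrightarrow [d] = [c] \Leftrightarrow d \sim c \Leftrightarrow d = c \in \mathcal{T}.$$

• Let $t = f(t_0, t_1, \dots, t_{n-1})$ for some $n$-ary function symbol $f$ and $S$-terms $t_0, t_1, \dots, t_{n-1}$ containing no free variables. Assume that
$$\mathfrak{A}_\sim \models f(t_0, t_1, \dots, t_{n-1}) = c.$$
Since $\mathcal{T}$ is maximally consistent,
$$\exists x(t_i = x) \in \mathcal{T}$$
for $i = 0, 1, \dots, n-1$. Since $C$ is a witness set, there exists $c_i \in C$ such that
$$t_i = c_i \in \mathcal{T},$$
which is equivalent to
$$t_i \sim c_i. \qquad (7.25)$$
Therefore, by induction,
$$\mathfrak{A}_\sim \models t_i = c_i.$$
Hence,
$$\mathfrak{A}_\sim \models f(c_0, c_1, \dots, c_{n-1}) = c.$$
This implies that
$$f_\sim([c_0], [c_1], \dots, [c_{n-1}]) = [c],$$
so by (7.23),
$$f(c_0, c_1, \dots, c_{n-1}) = c \in \mathcal{T}.$$
Therefore, by Lemma 7.4.15 and (7.25),
$$f(t_0, t_1, \dots, t_{n-1}) = c \in \mathcal{T}.$$

The converse is left to Exercise 4. ∎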


@ LEMMA 7.4.18 


If $\mathcal{T}$ is a consistent set of $S$-sentences with $C$ as a witness set, then $p \in \mathcal{T}$ if and only if $\mathfrak{A}_\sim \models p\ [I]$ for every $S$-sentence $p$.


PROOF 


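We can assume that $\mathcal{T}$ is maximally consistent (Exercise 5). Let $C$ be a witness set of $\mathcal{T}$. We prove the theorem by induction on formulas.

• Let $t_0 = t_1 \in \mathcal{T}$ for $S$-terms $t_0$ and $t_1$ with no free variables. Since $\mathcal{T}$ is maximally consistent,
$$\exists x(t_0 = x) \in \mathcal{T} \quad \text{and} \quad \exists x(t_1 = x) \in \mathcal{T}.$$
Because $C$ is a witness set for $\mathcal{T}$, there exist $c, d \in C$ such that
$$t_0 = c \in \mathcal{T} \quad \text{and} \quad t_1 = d \in \mathcal{T}.$$
This implies that $t_0 \sim c$ and $t_1 \sim d$, so because $t_0 \sim t_1$,
$$c \sim d.$$
Therefore, $\mathfrak{A}_\sim \models c = d$, but because $\mathfrak{A}_\sim \models t_0 = c$ and $\mathfrak{A}_\sim \models t_1 = d$ (Lemma 7.4.17), $\mathfrak{A}_\sim \models t_0 = t_1$. The proof of the converse is Exercise 7(a).

• Let $R(t_0, t_1, \dots, t_{n-1}) \in \mathcal{T}$ for some relation symbol $R$ in $S$ and $S$-terms $t_0, t_1, \dots, t_{n-1}$ with no free variables. As in the first part of this proof, there exist constants $c_i \in C$ ($i = 0, 1, \dots, n-1$) such that $t_i \sim c_i$. Thus, by Lemma 7.4.15, $R(c_0, c_1, \dots, c_{n-1}) \in \mathcal{T}$, and then by the definition of $R_\sim$, we have that
$$([c_0], [c_1], \dots, [c_{n-1}]) \in R_\sim.$$
Therefore,
$$\mathfrak{A}_\sim \models R(c_0, c_1, \dots, c_{n-1}),$$
so $\mathfrak{A}_\sim \models R(t_0, t_1, \dots, t_{n-1})\ [I]$ as in the first part. The proof of the converse is Exercise 7(b).

• Let $p$ be an $S$-sentence. Then, by induction and the maximal consistency of $\mathcal{T}$,
$$\neg p \in \mathcal{T} \Leftrightarrow p \notin \mathcal{T} \Leftrightarrow \mathfrak{A}_\sim \not\models p\ [I] \Leftrightarrow \mathfrak{A}_\sim \models \neg p\ [I].$$

• Let $p$ and $q$ be $S$-sentences. Assume $p \to q \in \mathcal{T}$ and $\mathfrak{A}_\sim \models p\ [I]$. By induction, $p \in \mathcal{T}$, so by Lemma 7.4.5 and MP, $q \in \mathcal{T}$. Again by induction, $\mathfrak{A}_\sim \models q\ [I]$, so $\mathfrak{A}_\sim \models p \to q\ [I]$. See Exercise 7(c) for the proof of the converse.

• Let $p$ be an $S$-formula with at most one free variable $x$. By induction and the maximal consistency of $\mathcal{T}$,
$$\mathfrak{A}_\sim \models \exists x\, p \Leftrightarrow p\,\tfrac{c}{x} \in \mathcal{T} \text{ for some } c \in C \Leftrightarrow \exists x\, p \in \mathcal{T}.$$ ∎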


The hard work of this section leads to the fundamental theorem of model theory. 


@ THEOREM 7.4.19 [Henkin] 


An $S$-theory $\mathcal{T}$ is consistent if and only if $\mathcal{T}$ has a model.


PROOF 

Let $\mathcal{T}$ be a set of $S$-sentences. Suppose $\mathcal{T}$ is consistent and let $C$ be a set of constant symbols not found in $S$ such that $|C| = |L(S)|$. By Lemma 7.4.13, there exists a consistent extension $\mathcal{T}'$ of $\mathcal{T}$ such that the elements of $\mathcal{T}'$ are $(S \cup C)$-sentences and $C$ is a witness set for $\mathcal{T}'$. By Lemma 7.4.18, there exists a model $\mathfrak{A}$ of $\mathcal{T}'$. Since the sentences of $\mathcal{T}$ do not contain any of the constants of $C$, the reduct of $\mathfrak{A}$ to $S$ is a model of $\mathcal{T}$. To see this, let $\mathfrak{B}$ be the indicated reduct and proceed by induction on formulas.

• Let $t_0$ and $t_1$ be $S$-terms such that $(t_0 = t_1) \in \mathcal{T}$. These terms are also $(S \cup C)$-terms, so since $\mathcal{T} \subseteq \mathcal{T}'$, we have that $\mathfrak{A} \models t_0 = t_1$. Thus, there is an $(S \cup C)$-interpretation $I$ such that $\mathfrak{A} \models t_0 = t_1\ [I]$. Hence, $I(t_0) = I(t_1)$. Let $I'$ be the restriction of $I$ to $S$. Since $t_0$ and $t_1$ contain no symbols from $C$, $I'(t_0) = I'(t_1)$, so $\mathfrak{B} \models t_0 = t_1\ [I']$ because $\mathfrak{B}$ is the reduct of $\mathfrak{A}$ to $S$. By Theorem 7.1.31, we have that $\mathfrak{B} \models t_0 = t_1$.

• That $R(t_0, t_1, \dots, t_{n-1}) \in \mathcal{T}$ implies $\mathfrak{B} \models R(t_0, t_1, \dots, t_{n-1})$ for any $n$-ary relation symbol $R \in S$ and $S$-terms $t_0, t_1, \dots, t_{n-1}$ is Exercise 8(a).

• Let $\neg p \in \mathcal{T}$. This implies that $\mathfrak{A} \models \neg p$. That is, not $\mathfrak{A} \models p$. Since $\mathfrak{B}$ is the reduct of $\mathfrak{A}$ to $S$, not $\mathfrak{B} \models p$. In other words, $\mathfrak{B} \models \neg p$.

• That $(p \to q) \in \mathcal{T}$ implies $\mathfrak{B} \models p \to q$ for $S$-sentences $p$ and $q$ is Exercise 8(b).




e Suppose that dxp € FY, where p = p(x) is an S-formula. Since C is a 
witness set and Y is maximally consistent, oo € & for some c € C. By 
induction, * 

bY A am 
x 
Let J be an interpretation of 2. Then, 2 F p [I ey, so we conclude that 
WE Axp. 


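To prove the converse, suppose that $\mathcal{T} \vdash q \wedge \neg q$ for some $S$-sentence $q$. By the soundness theorem (7.4.9), $\mathcal{T} \models q \wedge \neg q$. Hence, if $\mathfrak{A}$ is an $S$-structure such that $\mathfrak{A} \models \mathcal{T}$, then $\mathfrak{A} \models q \wedge \neg q$, which implies that $\mathfrak{A} \models q$ and not $\mathfrak{A} \models q$. Hence, $\mathcal{T}$ does not have a model. ∎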


Henkin’s theorem (7.4.19) is used to prove the converse to the soundness theorem 
(7.4.9). 


@ COROLLARY 7.4.20 [Completeness] 


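Let $\mathcal{T}$ be a consistent set of $S$-sentences. For any $S$-sentence $p$, if $\mathcal{T} \models p$, then $\mathcal{T} \vdash p$.


PROOF
Suppose that $\mathcal{T} \nvdash p$. This implies that $\mathcal{T} \cup \{\neg p\}$ is consistent, so by Henkin's theorem (7.4.19), there exists an $S$-structure $\mathfrak{A}$ such that $\mathfrak{A} \models \mathcal{T} \cup \{\neg p\}$. Hence, $\mathcal{T} \not\models p$. ∎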


M@ COROLLARY 7.4.21 


First-order logic is complete. 


The next corollary was first proved by Kurt Gödel (1929). It was his doctoral dissertation.


■ COROLLARY 7.4.22 [Gödel's Completeness Theorem]


An S-sentence is a theorem if and only if it is valid. 


M@ COROLLARY 7.4.23 


For all sets of $S$-sentences $\mathcal{T}$ and $S$-sentences $p$, $\mathcal{T} \vdash p$ if and only if $\mathcal{T} \models p$.


The use of models to show consistency is now a common technique in mathematical logic. An early example was Hilbert's use of $\mathbb{R} \times \mathbb{R}$ to show that Euclidean geometry is consistent (Hilbert 1899). That is, $\mathbb{R} \times \mathbb{R}$ serves as the domain of a structure that models the postulates of Euclidean geometry. The proof relies on the consistency of the properties of $\mathbb{R} \times \mathbb{R}$, a fact that is left unproved. A later usage is the model used by Gödel that shows that both CH and the axiom of choice (5.1.10) are consistent with ZF (Gödel 1940). Both are examples of relative consistency proofs. Gödel assumed the consistency of ZF, so he had a model for it. From this he defined another model that satisfies CH and, in addition, the axiom of choice. We represent this by writing
$$\mathrm{Con}(\mathsf{ZF}) \to \mathrm{Con}(\mathsf{ZFC} + \mathsf{CH}).$$


This is a partial answer to one of the ten then unsolved problems that Hilbert presented at the International Congress of Mathematicians at Paris in 1900. An expanded list was published that year in Germany with an English translation released two years later (Hilbert 1902). Hilbert's intention was to outline the important problems that mathematicians should strive to solve in the twentieth century. The first problem involved the continuum hypothesis. Namely, it should be determined whether

as regards equivalence there are ... only two assemblages of numbers, the 
countable assemblage and the continuum. 
Hilbert’s second problem dealt with the axioms of arithmetic. Specifically, he thought 
that any of the chosen axioms should not be provable from the others. That is, they 
should be independent. More importantly, mathematicians should seek 
[t]o prove that they [the axioms] are not contradictory, that is, that a finite 
number of logical steps based upon them can never lead to contradictory 
results. 


In this problem we see the notions of finite proof (Definition 1.2.13) and consistency (Definitions 1.5.1 and 7.4.2), which would influence the development of model theory some 20 to 30 years after Hilbert's talk and lead to some interesting results.


Exercises 
1. Is $\varnothing$ consistent? Explain.
2. Prove Theorem 7.4.3. 


3. Prove that the group axioms (7.1.14) and the ring axioms (Axioms 7.1.36) are con- 
sistent. 


4. From the proof of Lemma 7.4.17, prove that if $f(t_0, t_1, \dots, t_{n-1}) = c \in \mathcal{T}$, then $\mathfrak{A}_\sim \models f(t_0, t_1, \dots, t_{n-1}) = c$ for all constant symbols $c$, $n$-ary function symbols $f$, and $S$-terms $t_0, t_1, \dots, t_{n-1}$.

5. In Lemma 7.4.18, why can $\mathcal{T}$ be assumed to be maximally consistent?

6. Let $\mathfrak{A}$ be an $S$-structure. Define $\mathrm{Th}(\mathfrak{A})$ to be the set of $S$-sentences that are satisfied by $\mathfrak{A}$. Prove that $\mathrm{Th}(\mathfrak{A})$ is maximally consistent.

7. Prove the given results from the proof of Lemma 7.4.18.
(a) If $\mathfrak{A}_\sim \models t_0 = t_1$, then $t_0 = t_1 \in \mathcal{T}$ for all $S$-terms $t_0$ and $t_1$ with no free variables.
(b) For all $S$-terms $t_0, t_1, \dots, t_{n-1}$ with no free variables, $R(t_0, t_1, \dots, t_{n-1}) \in \mathcal{T}$ if $\mathfrak{A}_\sim \models R(t_0, t_1, \dots, t_{n-1})$.
(c) $\mathfrak{A}_\sim \models p \to q$ implies that $p \to q \in \mathcal{T}$ for all $S$-sentences $p$ and $q$.

8. Show the following from the proof of Henkin's theorem (7.4.19).
(a) If $R(t_0, \dots, t_{n-1}) \in \mathcal{T}$, then $\mathfrak{B} \models R(t_0, \dots, t_{n-1})$ for any relation symbol $R \in S$ and $S$-terms $t_0, \dots, t_{n-1}$.
(b) If $p \to q \in \mathcal{T}$, then $\mathfrak{B} \models p \to q$ for all $S$-sentences $p$ and $q$.

9. Prove Corollary 7.4.22.


10. Let Z and @ be consistent sets of S-sentences. Prove that if 7 UY is inconsistent, 
then there exists an S-sentence p such that 7 F p but W@ F p. 


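11. We say that an $S$-theory $\mathcal{T}$ is closed under deductions if for all $S$-sentences $p$, if $\mathcal{T} \vdash p$, then $p \in \mathcal{T}$.
(a) If a theory is closed under deductions, is the theory maximally consistent?
(b) Let $\beta$ be an ordinal and suppose that $\{\mathcal{T}_\alpha : \alpha \in \beta\}$ is a family of $S$-theories such that for all $\gamma, \delta \in \beta$, if $\gamma \in \delta$, then $\mathcal{T}_\gamma \subseteq \mathcal{T}_\delta$. Prove that if $\mathcal{T}_\alpha$ is consistent for all $\alpha \in \beta$, then $\bigcup_{\alpha \in \beta} \mathcal{T}_\alpha$ is consistent.

12. A theory $\mathcal{T}$ is finitely axiomatizable if there exists an $S$-sentence $p$ such that for all $S$-sentences $q$, $\mathcal{T} \models q$ if and only if $p \models q$. Let $\{\mathcal{T}_n : n \in \omega\}$ be a chain of finitely axiomatizable $S$-theories such that $m < n$ implies that $\mathcal{T}_m \subseteq \mathcal{T}_n$. Suppose that for all $n \in \omega$, there exists a model of $\mathcal{T}_n$ that is not a model of $\mathcal{T}_{n+1}$. Prove that $\bigcup_{n \in \omega} \mathcal{T}_n$ is not finitely axiomatizable.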


13. Suppose that Y and J, are S-theories. Let 2f and B be S-structures. Prove the 
following. 

(a) If F, Fs, then every model of Y, is a model of J. 

(b) If & C B, then Th(B) € Th(2). 

(c) If AEF, UT), then WE J, and AE J. 

(d) M&F J, if and only if J, € Th). 


7.5 MODELS OF DIFFERENT CARDINALITIES 


The natural numbers and their basic operations of addition and multiplication can be defined using the axioms of ZFC (Section 5.2). It can then be proved that $\omega$ has basically the same algebraic properties as $\mathbb{N}$. Another approach to studying the natural numbers is to break down the subject to its most basic parts. Think about addition of natural numbers and how it is first explained. Adding 4 to 3 to obtain 7, for example, means starting at 3 and adding 1 four times in sequence:

3 + 1 = 4,
4 + 1 = 5,
5 + 1 = 6,
6 + 1 = 7.

Adding 1 simply means moving to the next natural number, so to add any two natural numbers, all one needs is to know the numerals and have the ability to count (compare Definition 5.2.15). Although not efficient, this will do. Multiplication, the other operation, is based on addition. Multiplying 4 by 3 means writing 4 down 3 times and then adding:

4 + 4 + 4 = 12.

This means that multiplication is also based on the ability to count (compare Definition 5.2.18). This ability is represented by the successor function,

$$Sn = n + 1,$$


which is the basis of arithmetic (compare Definition 5.2.1). Thus, to understand the 
natural numbers and how they work, we simply need a successor function that satisfies 
the right rules. 
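The idea that counting alone suffices can be made concrete with a small sketch. It assumes that Python's built-in non-negative integers play the role of the numerals and that `succ` is the only primitive arithmetic step; the function names are chosen here for illustration and do not come from the text.

```python
def succ(n):
    """The successor function: move to the next natural number."""
    return n + 1


def add(m, n):
    """Add n to m by counting: apply the successor n times, starting at m."""
    result = m
    for _ in range(n):
        result = succ(result)
    return result


def mul(m, n):
    """Multiply m by n by writing m down n times and adding."""
    result = 0
    for _ in range(n):
        result = add(result, m)
    return result
```

With these definitions, `add(3, 4)` counts 4, 5, 6, 7 and `mul(4, 3)` computes 4 + 4 + 4, matching the two worked examples above.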


Peano Arithmetic 


Using the theory symbols AR (Example 2.1.4), we make the following axioms. They are a minor modification of the axioms for the system of arithmetic found in Arithmetices principia by Giuseppe Peano (1889).


■ AXIOMS 7.5.1
• P1. $\forall x(\neg\, Sx = 0)$
• P2. $\forall x \forall y(Sx = Sy \to x = y)$
• P3. For every AR-formula $p$ with free variable $x$,
$$p(0) \wedge \forall x[p(x) \to p(Sx)] \to \forall x\, p(x).$$

Denote {P1, P2, P3} by P. If we define
$$\mathfrak{p}(S) = {}^+, \qquad \mathfrak{p}(0) = \varnothing, \qquad (7.26)$$
where ${}^+$ is the successor operation on $\omega$, then Theorem 5.2.2, Theorem 5.2.6, and Corollary 5.2.12 imply that $\mathfrak{P} = (\omega, \mathfrak{p})$ is a model of P. Because of the existence of a model for these axioms, which was constructed using ZFC, we conclude the following (Theorem 7.4.19).


M@ THEOREM 7.5.2 


The consistency of P is a consequence of ZFC. 


Since the Peano axioms are intended to be the assumptions of number theory, AR is often extended to AR' (Example 2.1.4) and the sentences of Axioms 7.5.1 broadened to include axioms involving $+$, $\cdot$, and $<$. These axioms serve as the foundation of number theory.


■ AXIOMS 7.5.3 [Peano]
• PA1. $\forall x(\neg\, Sx = 0)$
• PA2. $\forall x \forall y(Sx = Sy \to x = y)$
• PA3. $\forall x(x + 0 = x)$
• PA4. $\forall x \forall y[x + Sy = S(x + y)]$
• PA5. $\forall x(x \cdot 0 = 0)$
• PA6. $\forall x \forall y(x \cdot Sy = x \cdot y + x)$
• PA7. $\forall x(\neg\, x < 0)$
• PA8. $\forall x \forall y[x < Sy \leftrightarrow (x < y \vee x = y)]$
• PA9. $\forall x \forall y(x < y \vee x = y \vee y < x)$
• PA10. For every AR'-formula $p$ with free variable $x$,
$$p(0) \wedge \forall x[p(x) \to p(Sx)] \to \forall x\, p(x).$$


Let PA denote the Peano axioms (7.5.3). We call the set of consequences of the Peano axioms Peano arithmetic. To find a model for these axioms, extend the function $\mathfrak{p}$ (7.26) to $\mathfrak{p}'$ so that $\mathfrak{p}'(+)$ is the addition of Definition 5.2.15, $\mathfrak{p}'(\cdot)$ is the multiplication of Definition 5.2.18, and using the order of Definition 5.2.10,
$$\mathfrak{p}'(<) = \{(m, n) \in \omega \times \omega : m \in n\}.$$

That $\mathfrak{p}'(<)$ satisfies PA7 through PA9 is left to Exercise 2. Then, as before, $\mathfrak{P}' = (\omega, \mathfrak{p}')$ is a model of the Peano axioms, which is an expansion of the model $\mathfrak{P}$, and we can again apply Theorem 7.4.19 to obtain a consistency result.


M@ THEOREM 7.5.4 


The consistency of PA is a consequence of ZFC. 


We call $\mathfrak{P}'$ and any AR'-structure isomorphic to it a standard model of Peano arithmetic. The elements of the domain of any standard model are called standard numbers. Any model of Peano arithmetic that is not isomorphic to $\mathfrak{P}'$ is called a nonstandard model of Peano arithmetic.

Observe that P and PA are sets of axioms separate from ZFC. However, because of the work of Section 5.2, both P and PA can be viewed as consequences of ZFC. Specifically, if P1 is replaced by
$$\forall x(x^+ \neq \varnothing), \qquad (7.27)$$
P2 is replaced by
$$\forall x \forall y(x^+ = y^+ \to x = y), \qquad (7.28)$$
and P3 is replaced by
$$p(\varnothing) \wedge \forall x[p(x) \to p(x^+)] \to \forall x\, p(x) \qquad (7.29)$$
for every ST-formula $p$ with free variable $x$, then
$$\mathsf{ZFC} \vdash \{(7.27), (7.28), (7.29)\}.$$

That is, a copy of P is found among the consequences of ZFC. Also, using Definitions 5.2.15 and 5.2.18, it can be shown that a copy of Peano arithmetic is found among the consequences of ZFC.

We will not repeat all our number theory work in Peano arithmetic, although it is 
possible to do so. Instead, we give a sample of some basic results to illustrate that 
the two systems are the same. Because of the work in Section 5.2, the addition and 
multiplication of Peano arithmetic are commutative, associative, satisfy the distributive 
and cancellation laws, and have identities in the standard model. That these properties 
hold in every model of Peano arithmetic is a separate issue, yet their proofs are similar 
to the work of Section 5.2. 


M@ THEOREM 7.5.5 
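The following AR'-sentences are theorems of PA.

Associative Laws
$$\forall x \forall y \forall z[x + (y + z) = (x + y) + z]$$
$$\forall x \forall y \forall z[x \cdot (y \cdot z) = (x \cdot y) \cdot z]$$

Commutative Laws
$$\forall x \forall y(x + y = y + x)$$
$$\forall x \forall y(x \cdot y = y \cdot x)$$

Additive Identity
$$\forall x(0 + x = x)$$

Multiplicative Identity
$$\forall x(S0 \cdot x = x)$$

Distributive Law
$$\forall x \forall y \forall z[x \cdot (y + z) = x \cdot y + x \cdot z]$$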


PROOF 
We prove the first associative law and the multiplicative identity property, leaving the others to Exercise 9.

• Define
$$p(z) := \forall x \forall y[x + (y + z) = (x + y) + z].$$
By PA3, we have $p(0)$ because
$$x + (y + 0) = x + y = (x + y) + 0.$$
Next, assume $p(k)$. Since $S$ is a function symbol, PA4 yields
$$S[x + (y + k)] = S[(x + y) + k],$$
$$x + S(y + k) = (x + y) + Sk,$$
$$x + (y + Sk) = (x + y) + Sk.$$
Therefore, $p(Sk)$, so the associative law for addition holds by PA10.

• Assuming that the commutative laws have already been proved, PA6, PA5, and PA3 imply that
$$S0 \cdot x = x \cdot S0 = x \cdot 0 + x = 0 + x = x + 0 = x,$$
so $S0 \cdot x = x$ by E3 (Axioms 5.1.1). ∎
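A finite spot check can accompany the inductive proofs. The sketch below defines addition and multiplication by the recursions that PA3 through PA6 describe and tests the laws of Theorem 7.5.5 on a small range. It assumes Python integers as the standard numbers, with `y - 1` standing for the predecessor of a nonzero `y` and `+ 1` for the successor; checking finitely many cases is of course no substitute for the proof by PA10.

```python
from itertools import product


def add(x, y):
    # PA3: x + 0 = x.  PA4: x + Sy = S(x + y).
    return x if y == 0 else add(x, y - 1) + 1


def mul(x, y):
    # PA5: x * 0 = 0.  PA6: x * Sy = x * y + x.
    return 0 if y == 0 else add(mul(x, y - 1), x)


N = range(6)  # a small initial segment of the standard numbers
laws_hold = all(
    add(x, add(y, z)) == add(add(x, y), z)                  # associativity of +
    and mul(x, mul(y, z)) == mul(mul(x, y), z)              # associativity of *
    and add(x, y) == add(y, x)                              # commutativity of +
    and mul(x, y) == mul(y, x)                              # commutativity of *
    and add(0, x) == x                                      # additive identity
    and mul(1, x) == x                                      # multiplicative identity, 1 = S0
    and mul(x, add(y, z)) == add(mul(x, y), mul(x, z))      # distributive law
    for x, y, z in product(N, repeat=3)
)
```

The range is kept small because the recursion depth grows with the size of the second argument.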


To ease our notation when dealing with numbers other than $1 = S0$, define
$$S^0 x = x \qquad (7.30)$$
and for all $n \in \omega \setminus \{0\}$,
$$S^n x = \underbrace{SS \cdots S}_{n \text{ times}}\, x. \qquad (7.31)$$

We will use this notation in the next example.


EXAMPLE 7.5.6 
In the formula,
$$(3 + 2) + 1 = (1 + 3) + 2, \qquad (7.32)$$
two properties from Theorem 7.5.5 are being applied. First, we conclude that $(3 + 2) + 1 = 1 + (3 + 2)$ by the commutative law. The associative law is then used to draw the final conclusion. Notice that (7.32) is the standard interpretation of
$$+\,+\,SSS0\ SS0\ S0 = +\,+\,S0\ SSS0\ SS0,$$
which, using (7.31), is equivalent to
$$+\,+\,S^3 0\ S^2 0\ S^1 0 = +\,+\,S^1 0\ S^3 0\ S^2 0.$$
Also, the distributive law is applied in the formula
$$2 \cdot x + 3 \cdot x = 5 \cdot x,$$
because
$$2 \cdot x + 3 \cdot x = x \cdot 2 + x \cdot 3 = x \cdot (2 + 3) = x \cdot 5 = 5 \cdot x.$$


Given an equation like
$$5 \cdot x + 1 = 11, \qquad (7.33)$$
the standard routine is to solve it like this:
$$5 \cdot x + 1 = 11,$$
$$5 \cdot x = 10,$$
$$x = 2.$$

In the first step, $-1$ was added to both sides, and in the second, both sides were multiplied by $1/5$. However, only 0 has an additive inverse in Peano arithmetic, and only 1 has a multiplicative inverse. The way around this problem is to use cancellation laws.




Hi THEOREM 7.5.7 [Cancellation] 
The following AR'-sentences are theorems of PA.
• $\forall x \forall y \forall z(x + z = y + z \to x = y)$
• $\forall x \forall y \forall z[(x \cdot z = y \cdot z \wedge \neg\, z = 0) \to x = y]$.

PROOF
The proof of the cancellation law for multiplication is Exercise 10. For addition, define
$$p(z) := \forall x \forall y(x + z = y + z \to x = y).$$

First, notice that if $x + 0 = y + 0$,
$$x = x + 0 = y + 0 = y,$$
where the first and last equalities hold by PA3. Therefore, $p(0)$. For the induction step, suppose $p(k)$. Then, by PA4 and PA2,
$$x + Sk = y + Sk,$$
$$S(x + k) = S(y + k),$$
$$x + k = y + k,$$
$$x = y,$$
where the last step follows by $p(k)$. Therefore, $\forall z\, p(z)$ by PA10. ∎


Therefore, in Peano arithmetic, solve (7.33) like this:
$$5 \cdot x + 1 = 11,$$
$$5 \cdot x + 1 = 10 + 1,$$
$$5 \cdot x = 10,$$
$$5 \cdot x = 5 \cdot 2,$$
$$x = 2.$$

The third and last equations follow by cancellation (Theorem 7.5.7).
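The cancellation route can be mimicked computationally without ever subtracting or dividing. The sketch below is an illustration only: `solve_linear` is a hypothetical helper, Python integers stand in for the standard numbers, and the rewriting steps (11 as 10 + 1, then 10 as 5 · 2) are found by bounded search.

```python
def solve_linear(a, b, c):
    """Solve a*x + b = c over the natural numbers for a != 0.

    Returns the solution x, or None when no natural number works.
    Only addition, multiplication, and equality tests are used.
    """
    # Rewrite c as y + b; additive cancellation then gives a*x = y.
    y = next((y for y in range(c + 1) if y + b == c), None)
    if y is None:
        return None
    # Rewrite y as a*x0; multiplicative cancellation then gives x = x0.
    return next((x0 for x0 in range(y + 1) if a * x0 == y), None)
```

For equation (7.33), `solve_linear(5, 1, 11)` first finds 10 and then 2. When a rewriting step is impossible, as in 5 · x + 1 = 12, the function returns `None`, which reflects that the equation has no solution among the standard numbers.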


Compactness Theorem 


Having a model that satisfies P or PA means that basic arithmetic can be done in this model. We know that this happens in the standard model (assuming ZFC), so we now want to know whether there are any nonstandard models of Peano arithmetic. To be successful in our search, we turn to some theorems that follow from Henkin's theorem (7.4.19). The first states that the existence of a model rests on the finite. It was first proved by Gödel for countable first-order theories (1930) and by Anatolij Mal'tsev for arbitrary first-order theories (1936).




Hi THEOREM 7.5.8 [Compactness] 


An $S$-theory $\mathcal{T}$ has a model if and only if every finite subset of $\mathcal{T}$ has a model.


PROOF
Since sufficiency is clear, let $\mathcal{T}$ have the property that every finite subset has a model. This implies that every finite subset of $\mathcal{T}$ is consistent by Henkin's theorem. Therefore, $\mathcal{T}$ is consistent by Theorem 7.4.3, so $\mathcal{T}$ also has a model by Henkin's theorem. ∎


We apply the compactness theorem to find another model of PA. Let $c$ be a constant symbol other than 0. For every natural number $n$, use (7.30) and (7.31) to define the set of $(\mathsf{AR}' \cup \{c\})$-sentences
$$\mathcal{T}_n = \mathsf{PA} \cup \{S^i 0 < c : i \in \omega \wedge i \le n\},$$


and let $\mathfrak{A}_n = (\omega, \mathfrak{a})$, where
$$\mathfrak{a} \restriction \mathsf{AR}' = \mathfrak{p}'$$
and
$$\mathfrak{a}(c) = n + 1.$$
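Observe that
$$\mathfrak{A}_n \models \mathcal{T}_n,$$
so $\mathcal{T}_n$ is consistent for all $n \in \omega$, which implies that it has a model by Henkin's theorem (7.4.19). Since every finite subset of the union is contained in some $\mathcal{T}_n$,
$$\mathcal{T} = \bigcup_{n \in \omega} \mathcal{T}_n$$
has a model by the compactness theorem (7.5.8). Let $\mathfrak{A}$ be the reduct of this model to AR'. Notice that $\mathfrak{A} \models \mathsf{PA}$, but because its domain has an element that is greater than the interpretation of $S^n 0$ for every $n \in \omega$, $\mathfrak{A}$ is a nonstandard model of Peano arithmetic (Theorem 7.3.24).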
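The finite-satisfiability step behind this argument can be spot-checked for the sentences that mention $c$. The sketch assumes that the numeral $S^i 0$ is interpreted as the standard number `i` and looks only at the sentences $S^i 0 < c$, not at the Peano axioms themselves; `witness` and `satisfies` are hypothetical helper names.

```python
def witness(indices):
    """Choose a standard number to interpret c, given finitely many
    indices i whose sentences S^i 0 < c must hold."""
    return max(indices, default=-1) + 1


def satisfies(c_value, indices):
    """Check that i < c_value for every listed index i."""
    return all(i < c_value for i in indices)


finite_subsets = [set(), {0}, {0, 1, 2}, {7, 40, 1000}]
all_satisfied = all(satisfies(witness(s), s) for s in finite_subsets)

# No single standard number satisfies S^i 0 < c for every i at once:
# the index i = c itself always fails.
no_global_witness = all(not satisfies(c, range(c + 1)) for c in range(50))
```

Each finite set of requirements is met by a standard number, while no standard number meets all of them, which is why the element interpreting $c$ in a model of the whole union cannot be standard.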


Löwenheim-Skolem Theorems


Now that we know that a nonstandard model of Peano arithmetic exists under ZFC, we want to know if there are others. For this, we need a definition. The power of an $S$-structure $\mathfrak{A}$ is denoted by $|\mathfrak{A}|$ and refers to the cardinality of the domain of $\mathfrak{A}$. This means that a model is countable if and only if its power is countable. We introduce a sequence of theorems related to this. They are due to Skolem (1922), Tarski and Vaught (1957), and Leopold Löwenheim (1915).


■ THEOREM 7.5.9 [Downward Löwenheim-Skolem]


Every consistent $S$-theory has a model with power of at most $|L(S)|$.


PROOF
Recall these facts from the proof of Henkin's theorem (7.4.19).

• $C$ is a set of new constant symbols such that $|C| = |L(S)|$.
• $\mathfrak{A}$ is an $(S \cup C)$-structure.
• $\mathfrak{B}$ is the reduct of $\mathfrak{A}$ to $S$.

We can assume that $\mathfrak{A}$ has the property that every element of the domain of $\mathfrak{A}$ is the interpretation of a constant of $C$. Then, since $\mathfrak{A}$ and $\mathfrak{B}$ have the same power,
$$|\mathfrak{B}| \le |L(S \cup C)| = |L(S)|.$$ ∎


Since the witness set of a single sentence is finite and there are only finitely many symbols in a sentence, $S \cup C$ can be assumed to be finite in the proof of the downward Löwenheim-Skolem theorem (7.5.9). If this is the case, $|L(S \cup C)| = |L(S)| = \aleph_0$, and we have the next corollary.


■ COROLLARY 7.5.10 [Löwenheim]


If an $S$-sentence has a model, it has a countable model.

Since $L(\mathsf{AR}')$ is countable, Theorem 7.5.9 also implies the following.


■ COROLLARY 7.5.11 [Skolem]


ZFC implies that there is a countable nonstandard model of Peano arithmetic.


In fact, although we do not prove it here, ZFC implies that there are $2^{\aleph_0}$ countable nonstandard models of Peano arithmetic.

The title of Theorem 7.5.9 suggests the existence of the following theorem.


■ THEOREM 7.5.12 [Upward Löwenheim-Skolem]


If an $S$-theory has an infinite model, it has a model of cardinality $\kappa$ for every $\kappa \ge |L(S)|$.


PROOF 
Let $\mathcal{T}$ be an $S$-theory with an infinite model $\mathfrak{A} = (A, \mathfrak{a})$. Take $\kappa \ge |L(S)|$. Choose a set $C = \{c_\alpha : \alpha \in \kappa\}$ of distinct constant symbols not found in $S$. Extend $\mathcal{T}$ to the $(S \cup C)$-theory $\mathcal{T}'$ by defining
$$\mathcal{T}' = \mathcal{T} \cup \{c_\alpha \neq c_\beta : \alpha \in \beta \in \kappa\}.$$
Let $\mathcal{S}$ be a finite subset of $\mathcal{T}'$. This implies that there exists
$$C' = \{c_{\alpha_0}, c_{\alpha_1}, \dots, c_{\alpha_{n-1}}\} \subseteq C$$
such that the constants from $C$ in the sentences of $\mathcal{S}$ are among the constants of $C'$. Expand the $S$-structure $\mathfrak{A}$ to the $(S \cup C')$-structure $\mathfrak{A}' = (A, \mathfrak{a}')$, where $\mathfrak{a}' \restriction S = \mathfrak{a}$ and
$$\mathfrak{a}'(c_{\alpha_i}) \in A$$
for all $i = 0, 1, \dots, n-1$. Since $A$ is infinite, we further assume that
$$\mathfrak{a}'(c_{\alpha_0}), \mathfrak{a}'(c_{\alpha_1}), \dots, \mathfrak{a}'(c_{\alpha_{n-1}})$$
are distinct. Therefore, $\mathfrak{A}' \models \mathcal{S}$, so by the compactness theorem (7.5.8), there exists a model $\mathfrak{B}$ of $\mathcal{T}'$, and due to the interpretation of the new constants, the power of $\mathfrak{B}$ is $\kappa$. Thus, the reduct of $\mathfrak{B}$ to $S$ is a model of $\mathcal{T}$ and has power $\kappa$. ∎


The upward Löwenheim-Skolem theorem with ZFC implies that the Peano axioms have models of all infinite cardinalities. Those of uncountable cardinality are nonstandard by definition.


The von Neumann Hierarchy 


We constructed a model of PA using ZFC. If we can find models of ZFC, we would 
have other models of PA, plus prove the consistency of ZFC. In order to find models 
of ZFC, we begin by searching for models of individual axioms using a definition due 
to von Neumann (1929). The objective of the definition is to construct a sequence 
of sets that have the property that every set is in one of the stages of the sequence. 
However, since von Neumann’s definition used functions instead of sets, it was Zermelo 
(1930) who gave it its more recognizable form. While von Neumann left his base stage 
empty, Zermelo allowed the first set of the sequence to contain objects, which are called 
urelements, that were not sets yet were allowed to be elements of sets (Exercise 20). 
These two approaches are combined in the following definition, which is named after 
von Neumann. 


@ DEFINITION 7.5.13 
Let $V_0 = \varnothing$. Let $\alpha$ be an ordinal.
• $V_{\alpha^+} = \mathrm{P}(V_\alpha)$.
• $V_\alpha = \bigcup_{\beta \in \alpha} V_\beta$ if $\alpha$ is a limit ordinal.
This is called the von Neumann hierarchy.


As proved in Exercise 22, $V_\alpha$ is a set for every ordinal $\alpha$ by the empty set, union, power set, and replacement axioms (5.1.2, 5.1.6, 5.1.7, 5.1.9). Thus, every element of $V_\alpha$ is a set. For example,

$$V_1 = \{\varnothing\},$$
$$V_2 = \{\varnothing, \{\varnothing\}\},$$
$$V_3 = \{\varnothing, \{\varnothing\}, \{\{\varnothing\}\}, \{\varnothing, \{\varnothing\}\}\}.$$

The sets in $V_\omega$ are finite and called the hereditarily finite sets. There are countably many of these sets because $V_n$ is countable for each $n \in \omega$ (Theorem 6.3.17), while $|V_{\omega+1}| = 2^{\aleph_0}$. Observe that the cardinality of $V_n$ for each $n \in \omega$ is finite but grows very quickly.

$$|V_0| = 0,$$
$$|V_4| = 2^{2^2},$$
$$|V_5| = 2^{2^{2^2}},$$
$$|V_6| = 2^{2^{2^{2^2}}}.$$
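The finite stages are small enough to build explicitly, which gives a quick way to see the growth. The following sketch represents sets as Python `frozenset` objects; the helper names `powerset`, `stage`, and `is_transitive` are chosen here for illustration.

```python
from itertools import chain, combinations


def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(sub) for sub in subsets)


def stage(n):
    """Return V_n for a finite n: V_0 is empty and V_{k+1} = P(V_k)."""
    v = frozenset()
    for _ in range(n):
        v = powerset(v)
    return v


def is_transitive(s):
    """A set is transitive when each of its elements is also a subset of it."""
    return all(x <= s for x in s)


sizes = [len(stage(n)) for n in range(6)]  # [0, 1, 2, 4, 16, 65536]
```

Going one stage further is already out of reach, since $V_6$ has $2^{65536}$ elements. The same helpers can be used to confirm on small cases that each finite stage is transitive and that the stages are nested.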


Before we can use individual stages of von Neumann’s hierarchy, we need to know 
some of their key properties. For this, we start with some lemmas. 


M@ LEMMA 7.5.14 
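Let $\alpha$ and $\beta$ be ordinals.
• $V_\alpha$ is a transitive set.
• If $\alpha \subseteq \beta$, then $V_\alpha \subseteq V_\beta$.


PROOF
Let $\zeta$ be a limit ordinal containing $\alpha$ and $\beta$. Define
$$A = \{\eta \in \zeta : V_\eta \text{ is transitive}\}.$$
Assume that $\mathrm{seg}(A, \delta) \subseteq A$ for the ordinal $\delta \in \zeta$. Let $B \in V_\delta$.

• $V_0$ is transitive because $V_0 = \varnothing$.

• Suppose that $\delta = \gamma^+$ for some ordinal $\gamma$. By definition, $B \in \mathrm{P}(V_\gamma)$, so $B \subseteq V_\gamma$. Take $x \in B$. This implies that $x \in V_\gamma$. Because $V_\gamma$ is transitive, we have that $x \subseteq V_\gamma$. Hence, $x \in \mathrm{P}(V_\gamma) = V_\delta$, so $B \subseteq V_\delta$.

• Let $\delta$ be a limit ordinal. Then, there exists $\gamma \in \delta$ such that $B \in V_\gamma$. Since $V_\gamma$ is transitive by hypothesis, $B \subseteq V_\gamma \subseteq V_\delta$.

We conclude that $\delta \in A$. Thus, by transfinite induction (Theorem 6.1.18), $A = \zeta$ and $V_\alpha$ is transitive.

Now, suppose $\alpha \subset \beta$. Since
$$V_\alpha \in \mathrm{P}(V_\alpha) \subseteq V_\beta,$$
$V_\alpha \in V_\beta$. However, the first part shows that $V_\beta$ is transitive, so $V_\alpha \subseteq V_\beta$. ∎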




@ LEMMA 7.5.15 


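If $\alpha$ is an ordinal, then $\alpha \in V_{\alpha^+}$.


PROOF
Let $\alpha$ be an ordinal and take $\zeta$ to be a limit ordinal such that $\alpha \in \zeta$ and proceed by transfinite induction.

• Since $V_0 = \varnothing$, we have that $\varnothing \in V_1$.

• Suppose that $\alpha \in V_{\alpha^+}$. Since $V_{\alpha^+}$ is transitive (Lemma 7.5.14), $\alpha \subseteq V_{\alpha^+}$. Also, $\{\alpha\} \subseteq V_{\alpha^+}$. Therefore, $\alpha \cup \{\alpha\} \subseteq V_{\alpha^+}$, so $\alpha^+ \in V_{\alpha^{++}}$.

• Let $\alpha$ be a limit ordinal and assume that $\beta \in V_{\beta^+}$ for all $\beta \in \alpha$. That is, $\beta \subseteq V_{\beta^+}$ for all $\beta \in \alpha$ since each $V_{\beta^+}$ is transitive. Hence,
$$\alpha = \bigcup_{\beta \in \alpha} \beta \subseteq \bigcup_{\beta \in \alpha} V_{\beta^+} = V_\alpha,$$
which implies that $\alpha \in V_{\alpha^+}$. ∎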


The definition of the von Neumann hierarchy along with the fact that every ordinal 
belongs to a member of the hierarchy suggests that many, if not all, sets also belong to 
the hierarchy. For this reason, we define the following. 


@ DEFINITION 7.5.16 


Let $\mathbf{V}$ denote the collection of all sets $A$ such that $A \in V_\alpha$ for some ordinal $\alpha$.


It is the case that all sets belong to the hierarchy. This is the next theorem. Its proof is 
aided by the use of two terms and a lemma. 


• A set $A$ is grounded if there exists an ordinal $\alpha$ such that $A \subseteq V_\alpha$.
• The transitive closure of $A$ is
$$\mathrm{TC}(A) = \{u : \forall v(A \subseteq v \wedge v \text{ is transitive} \to u \in v)\}.$$
As the name implies, $\mathrm{TC}(A)$ is a transitive set (Exercise 24).
The proof of the next lemma is left to Exercise 21. 


LEMMA 7.5.17 


Every element of A is grounded if and only if A is grounded. 


M@ THEOREM 7.5.18 


For every set $A$, there exists an ordinal $\alpha$ such that $A \in V_\alpha$.


PROOF
Suppose that $A$ is a set such that $A \notin V_\alpha$ for all ordinals $\alpha$. If $A \subseteq V_\beta$ for some ordinal $\beta$, then $A \in V_{\beta^+}$, which contradicts the hypothesis. Hence, $A$ is not grounded, which implies that $\{A\}$ is not grounded (Lemma 7.5.17). Define
$$B = \{u \in \mathrm{TC}(\{A\}) : u \text{ is not grounded}\}.$$

Since $B \neq \varnothing$, the regularity axiom (5.1.15) implies that there exists $C \in B$ such that $C \cap B = \varnothing$. Let $x \in C$. Since transitive closures are transitive, $x \in \mathrm{TC}(\{A\})$. However, $x \notin B$, so $x$ is grounded. From this we conclude that $C$ is grounded by Lemma 7.5.17, a contradiction. ∎


Therefore, every set is in V, but we know by Corollary 5.1.17 that V is not a set in that 
it cannot be built using ZFC. However, we sometimes want to refer to such collections 
even though they are not sets. For this reason the term class was introduced, so we call 
V the class of all sets. 

We are now ready to use sets from the von Neumann hierarchy to serve as models 
for axioms from ZFC (Section 5.1). 


M@ DEFINITION 7.5.19 
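Let $\alpha$ be an ordinal. Define the ST-structure $\mathfrak{V}_\alpha = (V_\alpha, \in)$.

Consider
$$\mathfrak{V}_3 = (\{\varnothing, \{\varnothing\}, \{\{\varnothing\}\}, \{\varnothing, \{\varnothing\}\}\}, \in). \qquad (7.34)$$

The elements of $V_3$ are the sets of the model. These elements are equal exactly when they share the same elements (Definition 3.3.7), so
$$\mathfrak{V}_3 \models \text{extensionality axiom}.$$
Because the union of any two elements of $V_3$ is an element of $V_3$, such as
$$\varnothing \cup \{\varnothing\} = \{\varnothing\}$$
and
$$\{\{\varnothing\}\} \cup \{\varnothing, \{\varnothing\}\} = \{\varnothing, \{\varnothing\}\},$$
we see that
$$\mathfrak{V}_3 \models \text{union axiom}.$$

If we take an ST-formula $p(x)$ and $A \in V_3$, then $\{x : x \in A \wedge p(x)\} \in V_3$. Hence,
$$\mathfrak{V}_3 \models \text{subset axioms}.$$

Because $V_3$ is finite,
$$\mathfrak{V}_3 \models \text{axiom of choice},$$
and we have that
$$\mathfrak{V}_3 \models \text{axiom of regularity}$$
since, for example,
$$\{\varnothing\} \cap \{\{\varnothing\}\} = \varnothing.$$

These results are particular examples of the next general theorem.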




@ THEOREM 7.5.20 


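If $\alpha$ is an ordinal,
• $\mathfrak{V}_\alpha \models$ extensionality axiom,
• $\mathfrak{V}_\alpha \models$ union axiom,
• $\mathfrak{V}_\alpha \models$ subset axioms,
• $\mathfrak{V}_\alpha \models$ axiom of choice,
• $\mathfrak{V}_\alpha \models$ axiom of regularity.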


PROOF 


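Let $\alpha$ be an ordinal and $I$ be an ST-interpretation of $\mathfrak{V}_\alpha$. We check that the extensionality axiom (5.1.4) holds in the model and leave the other parts of the proof to Exercise 25. Take $A, B \in V_\alpha$ and write $J = (I_x^A)_y^B$ for the interpretation that agrees with $I$ except that it assigns $A$ to $x$ and $B$ to $y$. Assume that
$$\mathfrak{V}_\alpha \models \forall u(u \in x \leftrightarrow u \in y)\ [J].$$
That is,
$$\text{for all } a \in V_\alpha, \quad \mathfrak{V}_\alpha \models u \in x\ [J_u^a] \text{ if and only if } \mathfrak{V}_\alpha \models u \in y\ [J_u^a].$$

We want to show that $A = B$, so let $a \in A$. Since $V_\alpha$ is transitive (Lemma 7.5.14), $a \in V_\alpha$, which implies that
$$\mathfrak{V}_\alpha \models u \in x\ [J_u^a].$$
Therefore,
$$\mathfrak{V}_\alpha \models u \in y\ [J_u^a],$$
which implies that $a \in B$. This proves that $A \subseteq B$. A similar proof shows that $B \subseteq A$. Thus, $A = B$, from which follows that
$$\mathfrak{V}_\alpha \models x = y\ [J].$$
Therefore,
$$\text{if } \mathfrak{V}_\alpha \models \forall u(u \in x \leftrightarrow u \in y)\ [J], \text{ then } \mathfrak{V}_\alpha \models x = y\ [J].$$
In other words,
$$\mathfrak{V}_\alpha \models \forall u(u \in x \leftrightarrow u \in y) \to x = y\ [J].$$
Since $A$ and $B$ were arbitrarily chosen,
$$\mathfrak{V}_\alpha \models \forall x \forall y(\forall u[u \in x \leftrightarrow u \in y] \to x = y).$$ ∎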


Again, using the ST-structure 83 (7.34), we see that the result of pairing two ar- 


bitrarily chosen elements of V3 into a single set might not be an element of V3. For 
example, 


({2}, {SO} }} E V3. 


422 Chapter 7 MODELS 


The same is true regarding power sets. For example,
$\mathrm{P}(\{\{\varnothing\}\}) = \{\varnothing, \{\{\varnothing\}\}\} \notin V_3$.


However, both of these sets are elements of $V_\omega$, which suggests that for the pairing
(5.1.5) and power set (5.1.7) axioms to hold in stage $V_\alpha$ of the von Neumann hierarchy,
$\alpha$ needs to be a limit ordinal.
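

The two counterexamples can be verified mechanically. This is a sketch only, assuming the same frozenset encoding of the finite stages with $V_0 = \varnothing$ and $V_{k+1} = \mathrm{P}(V_k)$; the names `power_set`, `stage`, `one`, `two`, `pair`, `single`, and `pset` are illustrative. It shows that each set falls outside $V_3$ yet already appears at the next finite stage $V_4$, and so in every later stage.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def stage(n):
    """Return the finite stage V_n, assuming V_0 is empty and V_{k+1} = P(V_k)."""
    v = frozenset()
    for _ in range(n):
        v = power_set(v)
    return v

V3, V4 = stage(3), stage(4)

empty = frozenset()
one = frozenset({empty})          # {∅}
two = frozenset({empty, one})     # {∅, {∅}}
pair = frozenset({one, two})      # {{∅}, {∅, {∅}}}
single = frozenset({one})         # {{∅}}
pset = power_set(single)          # {∅, {{∅}}}

# Pairing: both components lie in V_3, but the pair first appears in V_4.
print(one in V3, two in V3, pair in V3, pair in V4)

# Power set: {{∅}} lies in V_3, but its power set first appears in V_4.
print(single in V3, pset in V3, pset in V4)
```

Because every element of a limit stage lies in some earlier stage, the pair or power set lands one stage later and is still below the limit, which is the idea behind the next theorem.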


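■ THEOREM 7.5.21
If $\alpha$ is a limit ordinal,
• $\mathfrak{V}_\alpha \models$ pairing axiom,
• $\mathfrak{V}_\alpha \models$ power set axiom.


PROOF
Let $\alpha$ be a limit ordinal. Take $a, b \in V_\alpha$. Then, there exist $\beta_1, \beta_2 \in \alpha$ such that
$a \in V_{\beta_1}$ and $b \in V_{\beta_2}$. Without loss of generality, we can assume that $\beta_1 \subseteq \beta_2$,
which implies that $a \in V_{\beta_2}$ by Lemma 7.5.14. Therefore, $\{a, b\} \subseteq V_{\beta_2}$, so


$\{a, b\} \in \mathrm{P}(V_{\beta_2}) = V_{\beta_2 + 1} \subseteq V_\alpha$.


Hence,
$\mathfrak{V}_\alpha \models \forall u \forall v \exists x \forall w(w \in x \leftrightarrow w = u \vee w = v)$.


The proof of the second part of the theorem is left to Exercise 26. ∎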


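Certainly, the empty set will be an element of $V_\alpha$ provided that $\alpha$ is not empty, so
the proof of the next theorem is left to Exercise 27. In addition, $V_\alpha$ needs to contain $\omega$
to satisfy the infinity axiom.


■ THEOREM 7.5.22


If $\alpha$ is a nonempty ordinal, $\mathfrak{V}_\alpha \models$ empty set axiom.


■ THEOREM 7.5.23


If $\alpha$ is an ordinal such that $\omega \in \alpha$, then $\mathfrak{V}_\alpha \models$ infinity axiom.


PROOF
Since Lemma 7.5.15 implies that $\omega \in V_{\omega+1}$, by Lemma 7.5.14, we have that


$\mathfrak{V}_\alpha \models \exists x(\{\,\} \in x \wedge \forall u(u \in x \rightarrow \exists y(y \in x \wedge u \in y \wedge \forall v(v \in u \rightarrow v \in y))))$. ∎

Since $\omega \cdot 2 = \omega + \omega$ is a limit ordinal greater than $\omega$, we conclude the following.

■ THEOREM 7.5.24

$\mathfrak{V}_{\omega \cdot 2} \models$ Z.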


The ordinal $\omega + \omega$ requires one of the replacement axioms (5.1.9) to prove its existence
(page 317). This means that the proof of the consistency of Z relies on axioms not in
Z, so let us continue to examine the von Neumann hierarchy for a model of ZFC. It
will not be $\mathfrak{V}_{\omega \cdot 2}$ because not all replacement axioms are true in $\mathfrak{V}_{\omega \cdot 2}$. This is because
$\mathfrak{V}_{\omega \cdot 2}$ does not satisfy Theorem 6.1.19. To use a stage of the von Neumann hierarchy
as a model for ZFC, we need a strongly inaccessible cardinal (Definition 6.5.11),
as the next theorem states. We state it without proof.


■ THEOREM 7.5.25


$\mathfrak{V}_\kappa \models$ ZFC if and only if $\kappa$ is a strongly inaccessible cardinal.


However, the sentence 
there exists a strongly inaccessible cardinal (7.35) 


cannot be proved or disproved using ZFC. This means that (7.35) is independent of 
the axioms of set theory. It also means that we are ready for the next definition. Do not 
confuse it with the notion of a complete logic (Definition 7.4.8). 


■ DEFINITION 7.5.26


An S-theory $\mathcal{T}$ is complete if $\mathcal{T} \vdash p$ or $\mathcal{T} \vdash \neg p$ for all S-sentences $p$, else $\mathcal{T}$ is
incomplete.


The definition means that ZFC is not complete. It turns out that the underlying issue
is that ZFC satisfies the Peano axioms (7.5.3). The ability to do basic arithmetic
guarantees that there exists a sentence that is independent of ZFC. This result generalizes
to the incompleteness theorems due to Gödel (1931).


■ THEOREM 7.5.27 [Gödel’s First Incompleteness Theorem]


If the Peano axioms are provable from a consistent theory, the theory is incomplete.


Since ZFC is assumed to be consistent and it can be used to deduce the Peano axioms,
we conclude that ZFC is incomplete. Now suppose that instead of assuming the
consistency of ZFC, we try to prove that ZFC is consistent using its own axioms. Gödel’s
next theorem proves that we cannot do this, except under one condition.


■ THEOREM 7.5.28 [Gödel’s Second Incompleteness Theorem]


If a theory proves the Peano axioms and its own consistency, the theory is inconsistent.


Therefore, if ZFC could be used to prove (7.35), ZFC would prove that it has a model, 
which would imply that ZFC is consistent by Henkin’s theorem (7.4.19). Since we 
believe that ZFC is consistent, we conclude that ZFC cannot prove the existence of a 
strongly inaccessible cardinal. If ZFC were extended with an axiom that would allow
such a proof, there would be another issue with the extension that would prevent it from
proving its own consistency, provided that the new theory is consistent.




The statement of the first incompleteness theorem does not explicitly give a
mathematical statement that cannot be proved. It was left to later mathematicians to find
some. For example, when combined with Gödel’s proof of the relative consistency of
CH, the proof of Paul Cohen (1963) of


$\mathrm{Con}(\mathrm{ZFC}) \rightarrow \mathrm{Con}(\mathrm{ZFC} + \neg\mathrm{CH})$


shows that neither CH nor its negation can be proved from ZFC. This means that the
continuum hypothesis is independent of ZFC. In general, to prove that a sentence $p$ is
independent of a theory $\mathcal{T}$, do two things.


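• Find a model $\mathfrak{A}$ such that $\mathfrak{A} \models \mathcal{T} \cup \{p\}$. Then, $\mathcal{T} \not\models \neg p$ by Definition 7.1.20.
This implies that $\mathcal{T} \nvdash \neg p$ by Corollary 7.4.23.


• Find another model $\mathfrak{B}$ such that $\mathfrak{B} \models \mathcal{T} \cup \{\neg p\}$. From this conclude that $\mathcal{T} \not\models p$,
which implies $\mathcal{T} \nvdash p$.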
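

The two-model strategy can be tried on a much smaller theory than ZFC. The sketch below is an illustration under stated assumptions, not an example from the text: it takes the group axioms as the theory and commutativity as the sentence $p$, and checks by brute force that the integers modulo 3 form a model of the axioms together with $p$ while the permutations of three objects form a model of the axioms together with $\neg p$. All function and variable names are hypothetical.

```python
from itertools import permutations, product

def is_group(elems, op, e):
    """Brute-force check of closure, associativity, identity, and inverses."""
    closed = all(op(a, b) in elems for a, b in product(elems, repeat=2))
    assoc = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elems, repeat=3)
    )
    ident = all(op(e, a) == a == op(a, e) for a in elems)
    inverses = all(
        any(op(a, b) == e == op(b, a) for b in elems) for a in elems
    )
    return closed and assoc and ident and inverses

def is_commutative(elems, op):
    """Check the sentence p: for all a and b, a * b = b * a."""
    return all(op(a, b) == op(b, a) for a, b in product(elems, repeat=2))

# First model: the integers modulo 3 under addition (satisfies p).
z3 = [0, 1, 2]

def add3(a, b):
    return (a + b) % 3

# Second model: permutations of {0, 1, 2} under composition (satisfies not-p).
s3 = list(permutations(range(3)))

def compose(f, g):
    return tuple(f[g[i]] for i in range(3))

print(is_group(z3, add3, 0), is_commutative(z3, add3))
print(is_group(s3, compose, (0, 1, 2)), is_commutative(s3, compose))
```

Since both structures satisfy the group axioms but disagree on commutativity, neither commutativity nor its negation is a consequence of those axioms.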


Using this strategy, other statements have been discovered to be independent of ZFC.
Some of these are technical set theoretic statements such as Martin’s axiom (Martin and
Solovay 1970) or the diamond principle (Jensen 1972). There are independent
statements in other branches of mathematics as well. An example from group theory is the
independence of the Whitehead problem (Shelah 1974), named after the mathematician
John H. C. Whitehead (1950).


Exercises 


1. Write an AR′-formula equivalent to the given English sentences.
(a) $x$ is an even number.
(b) $x$ is an odd number.
(c) $x$ is a prime.
(d) $x$ divides $y$.


2. Prove that the order of Definition 5.2.10 proves the given AR′-sentences.
(a) $\forall x(\neg\, x < 0)$
(b) $\forall x \forall y[x < Sy \leftrightarrow (x < y \vee x = y)]$
(c) $\forall x \forall y(x < y \vee x = y \vee y < x)$


3. Prove that the given AR′-formulas can be proved from Axioms 7.5.3.
(a) $\forall x(0 < x \vee x = 0)$
(b) $\forall x[x = 0 \vee \exists y(x = Sy)]$
(c) $\forall x \forall y(x \cdot y = 0 \rightarrow x = 0 \vee y = 0)$
(d) $\forall x \forall y(x < y \leftrightarrow Sx < Sy)$
(e) $\forall x \forall y(x < y \vee x = y \vee y < x)$


4. Prove PA $\vdash (x + 2) \cdot (x + 3) = (x \cdot x + 5 \cdot x) + 6$, where 2, 3, 5, and 6 are understood
to mean the appropriate successors of 0.




5. Find examples of the following if possible.
(a) An S-theory $\mathcal{T}$ such that every two finite models of $\mathcal{T}$ are isomorphic, but
there exist two models of $\mathcal{T}$ that are infinite and not isomorphic.
(b) An S-theory $\mathcal{T}$ such that every two countable models of $\mathcal{T}$ are isomorphic,
but there exist two models of $\mathcal{T}$ that are uncountable and not isomorphic.


6. Let $\mathcal{T}$ be an S-theory and $p$ an S-sentence where $\mathcal{T} \models p$. Prove that there exists a
finite $\mathcal{U} \subseteq \mathcal{T}$ such that $\mathcal{U} \models p$.


7. Let $\mathcal{T}$ be an S-theory and $p$ an S-sentence such that $\mathcal{T} \models p$ and $p \models \mathcal{T}$. Show that
there exists a finite subset $\mathcal{U}$ of $\mathcal{T}$ such that $\mathcal{U} \models \mathcal{T}$.


8. Prove that the following are equivalent for an S-theory $\mathcal{T}$.
• $\mathcal{T}$ is consistent.
• $\mathcal{T}$ has a model.
• $\mathcal{T}$ has a countable model.
• Every finite subset of $\mathcal{T}$ is consistent.
• Every finite subset of $\mathcal{T}$ has a model.
• Every finite subset of $\mathcal{T}$ has a countable model.

9. Finish the proof of Theorem 7.5.5. 


10. Prove that the cancellation law holds for multiplication in Peano arithmetic. See 
Exercise 7.5.7. 


11. If possible, find an AR′-sentence $p \in$ Th(’) that is not a consequence of PA
(Exercise 7.4.6).


12. Demonstrate that there is no finite model of P or PA. 
13. Prove that there exists a model of P that is not isomorphic to . 
14. Prove that there is a countable nonstandard model of Peano arithmetic. 


15. The axioms for an ordered field are the ring axioms (7.1.36) plus the following
OF-sentences:

• $\forall x \forall y(x \otimes y = y \otimes x)$,

• $\exists x \forall y(x \otimes y = y)$,

• $\forall x[\neg\, x = 0 \rightarrow \exists y(x \otimes y = 1)]$,

• $\forall x \forall y(x < y \vee x = y \vee y < x)$.
Find a model for these axioms.


16. Show that there exists a model of the axioms for an ordered field (Exercise 15)
such that $\omega$ is a subset of the domain of the model and there exists $m$ in the domain so
that $n < m$ for every natural number $n$. In addition, prove that there are infinitely many
such $m$. What does $1/m$ look like?


17. Let $\mathfrak{R} = (\mathbb{R}, 0, +, \cdot, <)$ be a model of the axioms for an ordered field. Prove that
there is a countable model of Th($\mathfrak{R}$). What is the significance of this model?




18. Expand OF to OF $\cup\ \{E\}$, where $E$ is a binary function symbol. Find a model of
the axioms for an ordered field including the following (OF $\cup\ \{E\}$)-sentences:

• $\forall x(xE0 = S0)$

• $\forall x \forall y(xE(Sy) = (xEy)x)$


19. Let $\mathcal{T}$ be a theory such that for every $n \in \omega$, there exists $m \in \omega$ such that $m > n$
and $\mathcal{T}$ has a model of power $m$. Prove that $\mathcal{T}$ has an infinite model.


20. This hierarchy is due to Zermelo. Let $V_0$ be a set of atoms. For every ordinal $\alpha$,
define $V_{\alpha+1} = V_\alpha \cup \mathrm{P}(V_\alpha)$, and if $\alpha$ is a limit ordinal, define $V_\alpha = \bigcup_{\beta < \alpha} V_\beta$. Find $V_1$ and
$V_2$ assuming the given sets of atoms.

(a) $V_0 = \varnothing$

(b) $V_0 = \{a\}$

(c) $V_0 = \{0, 1, 2, 3\}$


21. Prove Lemma 7.5.17. 

22. Prove that V, is a set for every ordinal a. 

23. Using Exercise 6.3.22, show that $|V_{\omega+\alpha}| = \beth_\alpha$ for every ordinal $\alpha$.

24. Let A be a set. Prove that TC(A) is a transitive set. 

25. Prove the remaining parts of Theorem 7.5.20. 

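26. Let $\alpha$ be a limit ordinal. Prove that $\mathfrak{V}_\alpha \models \forall x \exists y \forall u(u \in y \leftrightarrow \forall v(v \in u \rightarrow v \in x))$.

27. Prove that $\mathfrak{V}_\alpha \models \exists x \forall y \neg(y \in x)$ if $\alpha \neq \varnothing$.

28. Let $\mathfrak{A}$ be an S-structure. Prove that Th($\mathfrak{A}$) is complete.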


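29. Let $\kappa$ be a cardinal and $\{\mathcal{T}_\alpha : \alpha \in \kappa\}$ be a chain of complete S-theories such that
for all $\gamma \in \delta \in \kappa$ we have that $\mathcal{T}_\gamma \subseteq \mathcal{T}_\delta$. Show that $\bigcup_{\alpha \in \kappa} \mathcal{T}_\alpha$ is complete.


30. Let $\mathcal{S} \subseteq \mathcal{T}$ be S-theories. Prove or show false: If $\mathcal{T}$ is complete, then $\mathcal{S}$ is
complete.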


APPENDIX 
ALPHABETS 


Greek Alphabet

Upper  Lower  Name
Α      α      alpha
Β      β      beta
Γ      γ      gamma
Δ      δ      delta
Ε      ε      epsilon
Ζ      ζ      zeta
Η      η      eta
Θ      θ      theta
Ι      ι      iota
Κ      κ      kappa
Λ      λ      lambda
Μ      μ      mu
Ν      ν      nu
Ξ      ξ      xi
Ο      ο      omicron
Π      π      pi
Ρ      ρ      rho
Σ      σ      sigma
Τ      τ      tau
Υ      υ      upsilon
Φ      φ      phi
Χ      χ      chi
Ψ      ψ      psi
Ω      ω      omega



A First Course in Mathematical Logic and Set Theory, First Edition. Michael L. O’ Leary. 
© 2016 John Wiley & Sons, Inc. Published 2016 by John Wiley & Sons, Inc. 




English Alphabet in the Fraktur Font

Upper  Lower  Name
𝔄      𝔞      A
𝔅      𝔟      B
ℭ      𝔠      C
𝔇      𝔡      D
𝔈      𝔢      E
𝔉      𝔣      F
𝔊      𝔤      G
ℌ      𝔥      H
ℑ      𝔦      I
𝔍      𝔧      J
𝔎      𝔨      K
𝔏      𝔩      L
𝔐      𝔪      M
𝔑      𝔫      N
𝔒      𝔬      O
𝔓      𝔭      P
𝔔      𝔮      Q
ℜ      𝔯      R
𝔖      𝔰      S
𝔗      𝔱      T
𝔘      𝔲      U
𝔙      𝔳      V
𝔚      𝔴      W
𝔛      𝔵      X
𝔜      𝔶      Y
ℨ      𝔷      Z


REFERENCES 


Aristotle (1984). The Complete Works of Aristotle, ed. J. Barnes. Princeton, NJ: Princeton 
University Press. 


Boole, G. (1847). The Mathematical Analysis of Logic : Being an Essay Towards a Calculus of 
Deductive Reasoning. Cambridge, UK: Macmillan, Barclay, & Macmillan. 


Boole, G. (1854). An Investigation of the Laws of Thought, on Which Are Founded the
Mathematical Theories of Logic and Probabilities. London: Walton and Maberly.


Boyer, C. B. and U. C. Merzbach (1991). A History of Mathematics (2nd ed.). New York: John 
Wiley & Sons, Inc. 


Burali-Forti, C. (1897). Una questione sui numeri transfiniti. Rendiconti del Circolo Matematico 
di Palermo 11(1), 154-164. 


Cantor, G. (1874). Ueber eine Eigenschaft des Inbegriffs aller reellen algebraischen Zahlen.
Journal für die reine und angewandte Mathematik 77, 258-262.


Cantor, G. (1888). Mitteilungen zur Lehre vom Transfiniten. Zeitschrift für Philosophie und
philosophische Kritik 91, 81-125.


Cantor, G. (1891). Über eine elementare Frage der Mannigfaltigkeitslehre. Jahresbericht der
Deutschen Mathematiker-Vereinigung 1, 75-78.


Cantor, G. (1932). Gesammelte Abhandlungen mathematischen und philosophischen Inhalts. 
Berlin: Springer-Verlag. 




Chang, C. C. and H. J. Keisler (1990). Model Theory (3rd ed.). Studies in Logic and the
Foundations of Mathematics. Amsterdam: North Holland.


Church, A. (1956). Introduction to Mathematical Logic. Princeton, NJ: Princeton University 
Press. 

Ciesielski, K. (1997). Set Theory for the Working Mathematician. London Mathematical Society 
Student Texts. Cambridge, UK: Cambridge University Press. 

Cohen, P. J. (1963). The independence of the continuum hypothesis. Proceedings of the National 
Academy of Sciences of the United States of America 50(6), 1143-1148. 

Copi, I. M. (1979). Symbolic Logic (5th ed.). New York: Macmillan Publishing.

Dauben, J. W. (1979). Georg Cantor: His Mathematics and Philosophy of the Infinite.
Princeton, NJ: Princeton University Press.

De Morgan, A. (1847). Formal Logic: or, the Calculus of Inference, Necessary and Probable. 
London: Taylor and Walton. 

Dedekind, R. (1893). Was sind und was sollen die Zahlen? Braunschweig: F. Vieweg. 

Dedekind, R. (1901). Essays on the Theory of Numbers, trans. W. W. Beman. Chicago: Open 
Court. 

Descartes, R. (1985). The Philosophical Writings of Descartes, eds. J. Cottingham, R. Stoothoff, 
and D. Murdoch. Cambridge, UK: Cambridge University Press. 

Doets, K. (1996). Basic Model Theory. Stanford: CSLI Publications. 


Drake, F. R. (1974). Set Theory: An Introduction to Large Cardinals, Volume 76 of Studies in 
Logic and Foundations of Mathematics. Amsterdam: North-Holland. 


Ebbinghaus, H., J. Flum, and W. Thomas (1984). Mathematical Logic. Undergraduate Texts in 
Mathematics. New York: Springer-Verlag. 

Ebbinghaus, H. and V. Peckhaus (2007). Ernst Zermelo: An Approach to His Life and Work. 
Berlin: Springer. 

Eklof, P. C. (1976). Whitehead’s problem is undecidable. American Mathematical Monthly 83,
775-788.

Eklof, P. C. and A. H. Mekler (2002). Almost Free Modules: Set-theoretic Methods (revised ed.). 
Amsterdam: North-Holland. 

Enderton, H. B. (1977). Elements of Set Theory. San Diego: Academic Press. 

Euclid (1925). The Elements, ed. T. Heath. Reprint, New York: Dover, 1956. 

Ewald, W. B. (2007). From Kant to Hilbert: A Source Book in the Foundations of Mathematics. 
Oxford: Oxford University Press. 


Fibonacci and L. E. Sigler (2002). Fibonacci’s Liber Abaci: A Translation into Modern English of
Leonardo Pisano’s Book of Calculation. Sources and Studies in the History of Mathematics
and Physical Sciences. New York: Springer.


Fraenkel, A. A. (1922). Zu den Grundlagen der Cantor-Zermeloschen Mengenlehre.
Mathematische Annalen 86, 230-237.


Fraleigh, J. B. (1999). A First Course in Abstract Algebra (6th ed.). Reading, MA: Addison- 
Wesley. 


Frege, G. (1879). Begriffsschrift, Eine Der Arithmetischen Nachgebildete Formelsprache Des 
Reinen Denkens. Halle a/S. 




Frege, G. (1884). Die Grundlagen Der Arithmetik: Eine Logisch Mathematische Untersuchung
Über Den Begriff Der Zahl. Breslau: W. Koebner.


Frege, G. (1893). Grundgesetze Der Arithmetik: Begriffsschriftlich Abgeleitet. Jena: H. Pohle.


Gödel, K. (1929). Über die Vollständigkeit des Logikkalküls. Ph. D. thesis, University of Vienna.


Gödel, K. (1930). Die Vollständigkeit der Axiome des logischen Funktionenkalküls.
Monatshefte für Mathematik und Physik 37, 349-360.


Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter
Systeme, I. Monatshefte für Mathematik und Physik 38, 173-198.


Gödel, K. (1940). Consistency of the Continuum Hypothesis. Princeton, NJ: Princeton University
Press.


Halmos, P. R. (1960). Naive Set Theory. New York: Springer-Verlag. 


Henkin, L. (1949). The Completeness of Formal Systems. Ph. D. thesis, Princeton University, 
Princeton, NJ. 


Herrlich, H. (2006). Axiom of Choice. Lecture Notes in Mathematics. Berlin: Springer. 
Hilbert, D. (1899). Grundlagen der Geometrie. Leipzig: Teubner. 


Hilbert, D. (1902). Mathematical problems, trans. M. F. W. Newson. Bulletin of the American 
Mathematical Society 8, 437-479. 


Hodges, W. (1993). Model Theory. Encyclopedia of Mathematics and its Applications.
Cambridge, U.K.: Cambridge University Press.


Hofstadter, D. R. (1989). Gödel, Escher, Bach: an Eternal Golden Braid. Reprint, New York:
Vintage Books.


Jech, T. (1973). The Axiom of Choice, Volume 75 of Studies in Logic and the Foundations of
Mathematics. Amsterdam: North-Holland.


Jech, T. (2003). Set Theory: The Third Millennium Edition. Springer Monographs in
Mathematics. Berlin: Springer.


Jensen, R. B. (1972). The fine structure of the constructible hierarchy. Annals of Mathematical 
Logic 4(3), 229-308. 


Kaye, R. (1991). Models of Peano Arithmetic, eds. D. S. Angus Macintyre, John Shepherdson. 
Oxford, UK: Clarendon Press. 


Kline, M. (1972). Mathematical Thought from Ancient to Modern Times. New York: Oxford 
University Press. 


Kneale, W. and M. Kneale (1964). The Development of Logic. Oxford, UK: Clarendon Press.


König, J. (1905). Zum Kontinuum-problem. Mathematische Annalen 60(2), 177-180.


König, J. (1906). Sur la théorie des ensembles. Comptes rendus hebdomadaires des séances de
l’Académie des sciences 143, 110-112.


Kunen, K. (1990). Set Theory: An Introduction to Independence Proofs, Volume 102 of Studies
in Logic and the Foundations of Mathematics. Amsterdam: North-Holland.


Kuratowski, K. (1921). Sur la notion de l’ordre dans la théorie des ensembles. Fundamenta
Mathematicae 2, 161-171.


Kuratowski, K. (1922). Sur l’opération Ā de l’Analysis Situs. Fundamenta Mathematicae 3,
182-199.




Leary, C. C. (2000). A Friendly Introduction to Mathematical Logic. Upper Saddle River, NJ: 
Prentice Hall. 


Leibniz, G. W. (1666). Dissertatio de arte combinatoria, in qua ex arithmeticae fundamentis 
complicationum ac transpositionum doctrina novis praeceptis extruitur, & usus ambarum per 
universum scientiarum orbem ostenditur; nova etiam artis meditandi, seu logicae inventionis 
semina sparguntur. Lipsiae, apud Joh. Simon Fickium et Joh. Polycarp. Seuboldum, Literis 
Sporelianis. 

Levy, A. (1979). Basic Set Theory. Perspectives in Mathematical Logic. Berlin: Springer-Verlag. 


Löwenheim, L. (1915). Über Möglichkeiten im Relativkalkül. Mathematische Annalen 76(4),
447-470.


Łukasiewicz, J. (1930). Elementy logiki matematycznej. Warsaw: s.n.


Łukasiewicz, J. (1951). Aristotle’s Syllogistic From the Standpoint of Modern Formal Logic.
Oxford, UK: Clarendon Press.


Mal’tsev, A. I. (1936). Untersuchungen aus dem Gebiete der mathematischen Logik.
Matematicheskii Sbornik 1(43), 323-336.


Martin, D. A. and R. M. Solovay (1970). Internal Cohen extensions. Annals of Mathematical 
Logic 2(2), 143-178. 


Mirimanoff, D. (1917). Les antinomies de Russell et de Burali-Forti et le problème fondamental
de la théorie des ensembles. L’Enseignement Mathématique 19, 37-52.


Pascal, B. (1665). Traité du triangle arithmetique, avec quelques autres petits traitez sur la
mesme matière. Paris: chez G. Desprez.


Peano, G. (1889). Arithmetices principia: nova methodo. Augustae Taurinorum [Torino]: 
Fratres Bocca. 


Reid, C. (1996). Hilbert. New York: Copernicus. 


Rosen, K. H. (1993). Elementary Number Theory and Its Applications (3rd ed.). Reading, MA: 
Addison-Wesley. 


Rubin, H. and J. E. Rubin (1985). Equivalents of the Axiom of Choice, II, Volume 116 of Studies 
in Logic and the Foundations of Mathematics. Amsterdam: North-Holland. 


Rubin, J. E. (1973). The compactness theorem in mathematical logic. Mathematics
Magazine 46(5), 261-265.


Shelah, S. (1974). Infinite abelian groups, Whitehead problem and some constructions. Israel
Journal of Mathematics 18(3), 243-256.


Skolem, T. (1922). Einige Bemerkungen zur axiomatischen Begründung der Mengenlehre. In
Proceedings of the 5th Scandinavian Mathematicians’ Congress in Helsinki, pp. 217-32.


Suppes, P. (1972). Axiomatic Set Theory. New York: Dover Publications. 


Tarski, A. (1935). Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica 1,
261-405.


Tarski, A. (1983). Logic, Semantics, Metamathematics: Papers from 1923 to 1938, trans. J. H. 
Woodger. Indianapolis: Hackett Publishing Company. 


Tarski, A. and R. L. Vaught (1957). Arithmetical extensions of relational systems. Compositio 
Mathematica 13, 81-102. 


van Dalen, D. (1994). Logic and Structure (3rd ed.). Berlin: Springer-Verlag. 




van Dalen, D., H. C. Doets, and H. de Swart (1978). Sets: Naive, Axiomatic and Applied, Volume 
106 of International Series in Pure and Applied Mathematics. Oxford, UK: Pergamon Press. 


Van Heijenoort, J. (1971). Frege and Gödel: Two Fundamental Texts in Mathematical Logic.
Cambridge MA: Harvard University Press.


Van Heijenoort, J. (1977). From Frege to Gödel: A Source Book in Mathematical Logic,
1879-1931. Source Books in History of Sciences. Cambridge, MA: Harvard University Press.


Venn, J. (1894). Symbolic Logic (2nd ed.). London: Macmillan. 


von Neumann, J. (1923). Zur Einführung der transfiniten Zahlen. Acta literarum ac
scientiarum Regiae Universitatis Hungaricae Francisco-Josephinae, Sectio scientiarum
mathematicarum I, 199-208.


von Neumann, J. (1928). Über die Definition durch transfinite Induktion und verwandte Fragen
der allgemeinen Mengenlehre. Mathematische Annalen 99, 373-391.


von Neumann, J. (1929). Über eine Widerspruchsfreiheitsfrage der axiomatischen Mengenlehre.
Journal für die reine und angewandte Mathematik 160, 227-241.


Whitehead, A. N. and B. Russell (1910). Principia Mathematica (2nd ed.). Cambridge, UK: 
Cambridge University Press. 


Whitehead, J. H. C. (1950). Simple homotopy types. American Journal of Mathematics 72(1), 
1-57. 


Wussing, H. (1984). The Genesis of the Abstract Group Concept, ed. H. Grant, trans.
A. Shenitzer. Cambridge, MA: MIT Press.


Zermelo, E. (1908). Untersuchungen über die Grundlagen der Mengenlehre. Mathematische
Annalen 65, 261-281.


Zermelo, E. (1930). Über Grenzzahlen und Mengenbereiche. Neue Untersuchungen über die
Grundlagen der Mengenlehre. Fundamenta Mathematicae 16, 29-47.


Zorn, M. (1935). A remark on method in transfinite algebra. Bulletin of the American
Mathematical Society 41, 667-670.


INDEX 


Abel, Niels, 342 
Abelian group, 342 
Absolute value, 116 
Abstraction, 122 
Addition, inference rule, 25 
Addition, matrix, 355 
Additive identity, 199, 412 
Additive inverse, 199 
Aleph, 313 
Algebraic, 315 
Alphabet, 5 
first-order, 68 
second-order, 73 
And, 4 
Antecedent, 4 
Antichain, 183 
Antisymmetric, 177 
Arbitrary, 88 
Argument form, 21 
Assignment, 7 
Associative, 34, 139, 199, 412 
Assumption, 46 
Asymmetric, 177 
Atom, 3, 6 
Automorphism, 380 


Axiom scheme, 228 
Axiom(s), 24 
choice, 231, 235 
empty set, 227 
equality, 226 
extensionality, 227 
foundation, 234 
Frege–Łukasiewicz, 24
group, 340
pairing, 228
power set, 228 
regularity, 234 
replacement, 230 
ring, 353 
separation, 228 
subset, 229 
union, 228 
Zermelo, 231 
Axiomatizable, finitely, 409 


Basis case, 258 
Bernstein, Felix, 301 
Beth, 316 
Biconditional, 5 
Biconditional proof, 107 




Bijection, 211 
Binary, 68 
function, 189 
operation, 198 
relation, 161 
Binomial coefficient, 263 
Binomial theorem, 263 
Boole, George, 117 
Bound 
lower, 181 
upper, 180 
Bound occurrence, 77 
Burali-Forti paradox, 297 
Burali-Forti theorem, 292 


Cancellation, 248, 256, 414 
Candidate, 99 
Cantor–Schröder–Bernstein theorem, 301
Cantor, Georg, 117, 226, 229, 303, 313 
Cardinal, 307 

large, 332 

limit, 314 

regular, 328 

singular, 328 

strongly inaccessible, 332 

successor, 314 

weakly inaccessible, 331 
Cardinality, 308 
Cartesian n-space, 131 
Cartesian plane, 130 
Cartesian product, 130 
Cases, 112 
Chain, 182 

elementary, 394 

of structures, 372 
Characteristic function, 304 
Choice axiom, 231, 235 
Choice function, 231 
Class, 420 
Class (relation), 171 
Closed, 198 
Closed interval, 120 
Closed under deductions, 409 
Codomain, 190 
Cofinal, 328 
Cofinality, 328 
Cohen, Paul, 424 
Coincidence, 349 
Combinatorics, 260 
Common divisor, 140 
Commutative, 34, 139, 199, 412 
Commutative ring, 354 
Compactness theorem, 52, 415 
Comparable, 181 
Compatible, 183 


Complement, 128 
Complete, 56, 398 
Complete theory, 423 
Completeness theorem, 61, 407 
Gödel’s, 407
Completeness, real numbers, 275 
Complex number, 281 
Composite number, 107 
Composition, 163, 195 
Compound, 4, 6 
Concatenation, 178 
Conclusion, 22, 26 
Conditional, 4 
Conditional proof, 45 
Congruent, 170 
Conjunct, 4 
Conjunction, 4 
Conjunction, inference rule, 25 
Connective, 6 
Consequence, 22, 346, 352 
Consequent, 4 
Consistency, relative, 407 
Consistent, 52, 395 
maximally, 53, 396 
Constant symbol, 68 
Constructive dilemma, 25 
Contingency, 16 
Continuous, 94 
Continuum hypothesis, 313 
generalized, 314 
Contradiction, 16 
Contradiction, proof by, 47 
Contrapositive, 32, 103 
Contrapositive law, 34 
Converse, 32 
Coordinates, 130 
Coordinatewise, 356 
Copy, 214 
Corollary, 20 
Corresponding occurrences, 77 
Countable, 310 
Counterexample, 102 
Cyclic group, 363, 365 


De Morgan, Augustus, 117 
De Morgan’s laws, 35, 137, 139 
Decreasing, 184, 203 
Dedekind cut, 276 
Dedekind, Richard, 229, 276 
Deduce, 26 

Deduction, 20 

Deduction theorem, 42 
Deductive logic, 2 

Dense, 257 

Denumerable, 310 


Descartes, René, 117 
Destructive dilemma, 25 
Diagonalization, 303 
Diamond principle, 424 
Direct existential proof, 99 
Direct proof, 45 
Discrete, 310 
Disjoint, 127, 155 
pairwise, 155 
Disjunct, 4 
Disjunction, 4 
Disjunctive normal form, 38 
Disjunctive syllogism, 25 
Distributive, 34, 139, 246, 323, 412 
left, 320 
Divides, 98 
Divisible, 98 
Division algorithm, 185 
polynomial, 110 
Division ring, 357 
Divisor, 98 
common, 140 
zero, 356 
Domain, 162, 334 
Dominate, 300 
Double negation, 35 
Downward closed, 274 


Downward Löwenheim–Skolem theorem, 415


Element, 117 
Elementary chain, 394 
Elementary equivalent, 387 
Elementary extension, 389 
Elementary substructure, 389 
Embedding, 214 
Empty set, 118, 141, 227 
Empty set axiom, 227 
Empty string, 6 
Endpoint, 120 
Equal, 118 
Equality, 194 
Equality axioms, 226 
Equality symbol, 68 
Equinumerous, 298 
Equivalence class, 171 
Equivalence relation, 169 
induced, 175 
Equivalence rule, 110 
Equivalence, logical, 31, 348 
Equivalent 
elementary, 387 
pairwise, 109 
Euclid, 19, 107, 265 
Euclid’s lemma, 265 
Evaluation map, 193 




Even integer, 98 

Excluded middle, 37 

Exclusive or, 11 

Existence, 104 

Existential formula, 72 

Existential generalization, 91 

Existential instantiation, 91 

Existential proof, 99, 106 

Existential proposition, 65 

Existential quantifier, 65 

Expansion, 352 

Exponentiation, 249, 321, 322 

Exportation, 35 

Extension, 197, 361 
elementary, 389 

Extensionality axiom, 227 


Factor, 98, 110 
Factorial, 242 
False, 3 
Family of sets, 148 
Fibonacci, 268 
Fibonacci number, 269 
Fibonacci sequence, 269 
generalized, 273 
Field, 357 
ordered, 69, 425 
Field theory, 69 
Finite, 308 
hereditarily, 417 
Finitely axiomatizable, 409 
First-order 
alphabet, 68 
formula, 73 
language, 73 
logic, 96 
Formal proof, 26 
Formation sequence, 7 
Formula, 71, 72 
first-order, 73 
second-order, 73 
Foundation axiom, 234 
Fraenkel, Abraham, 229 
Fraktur, 334, 428 
Free occurrence, 77 
Free variable, 78 
Frege, Gottlob, 24, 117 
Function, 189 
bijection, 211 
binary, 189 
characteristic, 304 
choice, 231 
continuous, 94 
decreasing, 203 
embedding, 214 




evaluation, 193 
greatest integer, 192 
homomorphism, 375 
identity, 191 
inclusion, 203 
increasing, 203 
injection, 206 
inverse, 204 
invertible, 204 
isomorphism, 382 
one-to-one, 206 
one-to-one correspondence, 211 
onto, 208 
order-preserving, 212 
periodic, 203 
projection, 210 
real-valued, 193 
surjection, 208 
unary, 189 
uniformly continuous, 94 
zero, 376 
Function equality, 194 
Function notation, 190 
Function symbol, 68 
Fundamental homomorphism theorem, 383 
Fundamental theorem of arithmetic, 271 


Galois, Evariste, 342 
General linear group, 345 
Generalization, 85 

existential, 91 

universal, 88 
Generalized continuum hypothesis, 314 
Generalized Fibonacci sequence, 273 
Generator, 363, 370 
Gödel, Kurt, 399, 407, 414, 423
Gödel’s completeness theorem, 407
Gödel’s incompleteness theorems, 423
Golden ratio, 271 
Grammar, 6 
Greatest common divisor, 140 
Greatest element, 180 
Greatest integer function, 192 
Greatest lower bound, 181 
Grounded, 419 
Group, 342 

abelian, 342 

cyclic, 363 

general linear, 345 

Klein-4, 344 

simple, 363 
Group axioms, 340 
Group theory, 69 
Grouping symbol, 6 


Half-open interval, 120 
Hartogs’ function, 327 
Hartogs’ theorem, 293 
Hausdorff maximal principle, 237 
Henkin, Leon, 399 
Henkin’s theorem, 406 
Hereditarily finite sets, 417 
Hereditary set, 235 
Hierarchy, von Neumann, 417 
Hilbert, David, 226, 408 
Hilbert’s problems, 408 
Homomorphism, 375 

group, 375 

ring, 375 
Hypothetical syllogism, 25 


Ideal, 368 
improper, 368 
left, 368 
maximal, 374 
prime, 374 
principal, 370 
principal left, 370 
proper, 368 
right, 368 
Idempotent laws, 139 
Identity, 162, 199 
additive, 199 
multiplicative, 199 
Identity map, 191 
Image, 190, 208, 216 
Implication, 4 
Improper ideal, 368 
Improper subgroup, 363 
Improper subring, 366 
Improper subset, 136 
Inaccessible cardinal, 331 
Inclusion map, 203 
Inclusive or, 11 
Incomparable, 181 
Incompatible, 183 
Incomplete theory, 423 
Incompleteness theorems, 423 
Inconsistent, 52, 395 
Increasing, 184, 203 
Independent, 408 
Index, 148 
Index set, 148 
Indexed, 149 
Indirect existential proof, 106 
Indirect proof, 47 
Induced equivalence relation, 175 
Induced partition, 174 
Induction 
mathematical, 257 


on formulas, 336 
on propositional forms, 59 
on terms, 335 
strong, 268 
transfinite, 283, 291 

Induction hypothesis, 59, 258 

Induction step, 258 

Inductive logic, 1 

Inductive set, 238 

Infer, 24 

Inference, 24 

Inference rule, 25 

Infinite, 308, 344 

Infinity axiom, 227 

Infinity symbol, 120 

Initial number, 327 

Initial segment, 274 
proper, 274 

Injection, 206 

Instantiation, 85 
existential, 91 
universal, 87 

Integers, 250 

Integral domain, 356 


International Congress of Mathematicians, 408 


Interpretation, 335 
Intersection, 126, 152 
Interval, 120 
Interval notation, 120 
Introduction, 97 
Invalid 
semantically, 21 
syntactically, 21 
Inverse, 166, 199 
additive, 199 
function, 204 
image, 216 
left, 215 
multiplicative, 199 
right, 216 
Inverse image, 216 
Inverse relation, 166 
Inverse statement, 37 
Invertible function, 204 
Invertible matrix, 345 
Irrational number, 128, 276 
Irreflexive, 177 
Isomorphic, 212, 344, 380, 382 
Isomorphism, 212, 380 
group, 382 
ring, 382 


Kernel, 379 
Klein-4 group, 344, 363 
König, Julius, 301, 330
König’s theorem, 330
Kuratowski, Kazimierz, 130, 232, 237 


Language, 73 
first order, 73 
Large cardinal, 332 
Law of noncontradiction, 37 
Law of the excluded middle, 37 
Least element, 180 
Least upper bound, 180 
Left distributive, 320 
Left ideal, 368 
Left inverse, 215 
Left zero divisor, 356 
Leibniz, Gottfried, 117 
Lemma, 20 
Leonardo of Pisa, 268 
Lexicographical order, 187 
Liber abaci, 268 
Limit cardinal, 314 
Limit ordinal, 290 
Lindenbaum’s theorem, 397 
Linear combination, 186 
Linear order, 182 
Linearly ordered set, 182 
Logic, 1 
deductive, 2 
first-order, 96 
inductive, 1 
mathematical, 2 
propositional, 20 
second-order, 96 
Logic symbol, 67 
Logical implication, 21 
Logical system, 20 
Logically equivalent, 31, 348 
Löwenheim, Leopold, 415
Löwenheim’s theorem, 416
Löwenheim–Skolem theorem
downward, 415
upward, 416
Lower bound, 181
greatest, 181
Łukasiewicz, Jan, 24, 71


Mal’tsev, Anatolij, 414 
Map, 190 

Martin’s axiom, 424 
Material equivalence, 13, 35 
Material implication, 12, 35 
Mathematical induction, 257 
strong, 268 
Mathematical logic, 2 
Mathematics, 1 

Mathesis universalis, 117 




Matrix, 345
identity, 345
invertible, 345
zero, 355
Matrix addition, 355
Matrix multiplication, 345
Maximal element, 181
Maximal ideal, 374
Maximally consistent, 53, 396
Meaning, 334
Metatheory, 96
Minimal element, 181
Mirimanoff, Dmitry, 229, 234
Model, 338, 339
Modern square of opposition, 86
Modulo, 170, 172
Modus ponens, 25
Modus tollens, 25
Multiple, 98
Multiplication, matrix, 345
Multiplicative identity, 199, 412
Multiplicative inverse, 176, 199
Mutually exclusive, 127


N-ary, 68, 161, 189
Name, 334
Natural number, 238
Necessary, 4, 109
Negated, 4
Negation, 4
Niece, 64, 161
Noncontradiction law, 37
Nonstandard model, 411
Norway, 342
Number, 256
standard, 411
Number theory, 69


Occurrence, 65
Odd integer, 98
One-to-one correspondence, 211
One-to-one function, 206
Onto function, 208
Open interval, 120
Operation, binary, 198
associative, 199
commutative, 199
Operation, set, 126
Or, 4
exclusive, 11
inclusive, 11
Order, 344
increasing, 184
lexicographical, 187
linear, 182
partial, 178
well, 183
Order isomorphism, 212
Order of connectives, 8, 9
Order of operations, 132
Order type, 212, 292
Order-preserving, 212
Ordered n-tuple, 131
Ordered field, 69, 425
Ordered pair, 130
Ordinal, 286
limit, 290
successor, 290
Ordinal number, 286


Pairing axiom, 228
Pairwise disjoint, 155
Pairwise equivalent, 109
Paragraph proof, 97
Parsing tree, 6
Partial order, 178
Partially ordered set, 178
Particular, 88
Partition, 173
induced, 174
Pascal’s identity, 263
Peano arithmetic, 69, 411
Peano axioms, 410
Peano, Giuseppe, 410
Periodic, 203
Permutation, 260
Pigeonhole principle, 309
Polynomial division algorithm, 110
Poset, 178
Positive form, 86
Postulate, 20
Power, 415
Power set, 151
Power set axiom, 228
Pre-image, 190
Predecessor, 237
Predicate, 64, 65
Prefix notation, 71
Premise, 22, 26
Prenex normal form, 96
Preserves, 212, 375
Prime ideal, 374
Prime number, 107
Prime power decomposition, 272
Principal ideal, 370
left, 370
Principal ideal domain, 372
Projection, 210
Proof methods
biconditional, 107


biconditional, short rule, 108 
cases, 112 
conditional, 45 
contradiction, 47 
counterexample, 102 
direct, 45 
disjunctions, 111 
equivalence rule, 110 
existence, 104 
existential, direct, 99 
existential, indirect, 106 
indirect, 47 
reductio ad absurdum, 47 
relative consistency, 407 
uniqueness, 104 
universal, 98 
Proof, formal, 26 
Proof, paragraph, 97 
Proof, two-column, 27 
Proper ideal, 368 
Proper initial segment, 274 
Proper subgroup, 364 
Proper subring, 366 
Proper subset, 136 
Proposition, 3 
compound, 4 
universal, 66 
Proposition alphabet, 5 
Propositional form, 6 
Propositional logic, 20 
Propositional variable, 6 
Prove, 26 
Pure set, 235 
Purely relational, 337 


Quantifier, 65, 66 
Quantifier negation, 86 
Quantifier symbol, 68 
Quotient, 110, 185 
Quotient set, 172 


Range, 162, 208 
Rational numbers, 254 
Ray, 120 
Real number, 276 
Real-valued function, 193 
Recursion, 242
transfinite, 294
Recursive definition, 6, 242 
Reduct, 352 
Reductio ad absurdum, 47 
Reflexive, 169 
Regular cardinal, 328 
Regularity axiom, 234 
Relation, 161
antisymmetric, 177
asymmetric, 177 
binary, 161 
inverse, 166 
irreflexive, 177 
reflexive, 169 
symmetric, 169 
transitive, 169 
unary, 161 
Relation on, 161 
Relation symbol, 68 
Relative consistency, 407 
Remainder, 110, 185 
Replacement axiom, 230 
Replacement rule, 34 
Restriction, 197 
Right ideal, 368 
Right inverse, 216 
Right zero divisor, 356 
Ring, 354 
commutative, 354 
division, 357 
field, 357 
integral domain, 356 
principal ideal domain, 372 
skew field, 357 
with unity, 354 
Ring axioms, 353 
Ring theory, 69 
Roster, 118 
Roster method, 118 
Russell’s paradox, 225 
Russell, Bertrand, 225 


Satisfiable, 340, 352 
Satisfy, 65, 338, 339 
Scheme, axiom, 228 
Schröder, Ernst, 301
Scope, 77 
Second-order
alphabet, 73
formula, 73
Second-order logic, 96 
Selector, 231 
Self-evident, 24 
Semantically invalid, 21 
Semantically valid, 21, 22 
Semantics, 21 
Sentence, 83 
Separation axioms, 228 
Set, 117
hereditary, 235
Set difference, 127 
Set operation, 126 
Set theory, 68 




Short rule of biconditional proof, 108 


Signature, 334 
Simple group, 363 
Simplification, 25 
Simultaneous substitution, 81 
Singleton, 118 
Singular cardinal, 328 
Skew field, 357 
Skolem’s theorem, 416 
Skolem, Thoralf, 229, 234, 415 
Sound, 56, 57, 398 
Soundness theorem, 57, 398 
Square of opposition, modern, 86 
Standard model, 411 
Standard number, 411 
String, 5 
Strong induction, 268 
Strongly inaccessible, 332 
Structure, 333
purely relational, 337
Subgroup, 363
improper, 363
proper, 364
trivial, 363
Subject, 64
Subproof, 46
Subring, 366
improper, 366
proper, 366
trivial, 366
Subset, 135
Subset axioms, 229
Substitution, 64, 75, 78
simultaneous, 81
Substructure, 361
elementary, 389
Successor, 237 
Successor cardinal, 314 
Successor ordinal, 290 
Sufficient, 4, 109 
Surjection, 208 
Symmetric, 169 
Symmetric closure, 177 
Symmetric difference, 236 
Syntactically invalid, 21 
Syntactically valid, 21, 27 
Syntax, 23 


Tarski, Alfred, 336, 415 
Tarski-Vaught theorem, 390 
Tautology, 16
Tautology rule, 35
Term, 70
Theorem, 20, 27
Theory, 395
Theory symbol, 68
Tower, 233 
Transcendental, 315 
Transfinite induction, 283, 291 
Transfinite recursion, 294 
Transitive, 169 
Transitive closure, 419 
Transitive set, 239 
Trichotomy, 183, 288 
weak, 179 
Trichotomy law, 183 
Trivial subgroup, 363 
Trivial subring, 366 
True, 3 
Truth table, 11 
Truth value, 3 
Two-column proof, 27 


Unary, 68 
function, 189 
relation, 161 
Undecidable, 313
Undefined, 190
Uniformly continuous, 94
Union, 126, 151
Union axiom, 228
Unique, 104
Unique factorization theorem, 271
Unit, 357
Unity, 354
Universal formula, 72
Universal generalization, 88
Universal instantiation, 87
Universal proof, 98
Universal proposition, 66
Universal quantifier, 66
Universe, 97, 141
Upper bound, 180
least, 180
Upward Löwenheim-Skolem theorem, 416
Urelement, 417


Valid, 347
semantically, 21, 22
syntactically, 21, 27
Valuation, 10, 13
Valuation patterns, 15
Variable, 67
propositional, 6
Variable symbol, 67
Vaught, Robert, 336
Venn diagram, 127
Venn, John, 127
Vertical line test, 189
Von Neumann hierarchy, 417
Von Neumann, John, 234, 238, 417


Weak-trichotomy law, 179 
Weakly inaccessible, 331 
Well-defined, 191 
Well-ordered set, 183 
Well-ordering theorem, 293 
Whitehead, John H. C., 424 
Whitehead problem, 424 
Witness, 91 

Witness set, 399 

Write, 23 


Zermelo axioms, 235 
Zermelo’s axiom, 231 
Zermelo’s theorem, 295 
Zermelo, Ernst, 225, 234, 293, 417 
Zermelo-Fraenkel axioms, 235
Zero, 102, 110 
Zero divisor, 356
left, 356
right, 356
Zero map, 376 
Zero matrix, 355 
Zorn’s lemma, 232 
Zorn, Max, 232 




WILEY END USER LICENSE AGREEMENT 


Go to www.wiley.com/go/eula to access Wiley’s ebook EULA. 


